{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "88791136",
   "metadata": {},
   "source": [
    "<table align=\"left\">\n",
    "  <td>\n",
    "    <a href=\"https://colab.research.google.com/github/nyandwi/machine_learning_complete/blob/main/6_classical_machine_learning_with_scikit-learn/10_intro_to_unsupervised_learning_with_kmeans_clustering.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>\n",
    "  </td>\n",
    "</table>"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b6ada71a",
   "metadata": {},
   "source": [
    "*This notebook was created by [Jean de Dieu Nyandwi](https://twitter.com/jeande_d) for the love of machine learning community. For any feedback, errors or suggestion, he can be reached on email (johnjw7084 at gmail dot com), [Twitter](https://twitter.com/jeande_d), or [LinkedIn](https://linkedin.com/in/nyandwi).*"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "7c5ba4fe",
   "metadata": {},
   "source": [
    "<a name='0'></a>\n",
    "# Intro to Unsupervised Learning - K-Means Clustering"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a15629b3",
   "metadata": {},
   "source": [
    "K-Means clustering is a type of unsupervised learning algorithms. In unsupervised learning, the machine learning model do not get the labels during training. It instead has to figure out the labels itself. It's like learning without instructions. It's like a teacher telling you, \"hey, here are 1000 exercises to use while preparing for a test, the test will be only 5 questions from all of those exercises.\" That can feel like a struggle, you will do all you can to narrow down these 100 exercises to 5. Some questions may be similar, or may be solved by one method, etc..the goal will merely be to narrow down the exercises, while maximizing the chance of passing the test. \n",
    "\n",
    "That type of example can be compared to clustering. The model is given bunch of data (lacking labels) and the job of the model is to find the labels that can be present according to the supplied data. \n",
    "\n",
    "\n",
    "K-Means Clustering require the number of clusters to be specified before training. The way this type of algorithm works is beyond the scope of this notebook but here are 3 main steps of how such algorithm work: \n",
    "\n",
    "* K-Means will randomly assigns samples of data to initial centroids of all clusters. This step is called initialization. A centroid is also referred to as a cluster center and it is the mean of all the sample of data in a cluster. \n",
    "\n",
    "* It then reassigns the samples to the nearest centroids.\n",
    "* It also find the new centroids of all clusters by taking the mean value of all of the samples assigned to each previous centroids. \n",
    "\n",
    "The last two steps are repeated until the stopping criterion is fulfilled or when difference between the old and new centroids is constant. \n",
    "\n",
    "Unspervised learning has got its applications in areas such as grouping web search results, customer segmentation, news aggregation and more. "
   ]
  },
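  {
   "cell_type": "markdown",
   "id": "c3f1a9d2",
   "metadata": {},
   "source": [
    "The three steps above can be sketched in a few lines of NumPy. This is a minimal illustration, not scikit-learn's implementation: the function name `kmeans_sketch`, its parameters, and the toy data are made up for this example, and initialization is assumed to simply pick `k` random samples as the starting centroids.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "\n",
    "def kmeans_sketch(X, k, n_iters=100, tol=1e-6, seed=0):\n",
    "    rng = np.random.default_rng(seed)\n",
    "    # Initialization: use k distinct random samples as the starting centroids\n",
    "    centroids = X[rng.choice(len(X), size=k, replace=False)]\n",
    "    for _ in range(n_iters):\n",
    "        # Assignment: find the nearest centroid for every sample\n",
    "        distances = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)\n",
    "        labels = distances.argmin(axis=1)\n",
    "        # Update: each centroid becomes the mean of its assigned samples\n",
    "        # (an empty cluster keeps its previous centroid)\n",
    "        new_centroids = np.array([\n",
    "            X[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]\n",
    "            for j in range(k)\n",
    "        ])\n",
    "        # Stop once the centroids have (almost) stopped moving\n",
    "        moved = np.linalg.norm(new_centroids - centroids)\n",
    "        centroids = new_centroids\n",
    "        if moved < tol:\n",
    "            break\n",
    "    return labels, centroids\n",
    "\n",
    "# Hypothetical toy data: two well-separated blobs of three points each\n",
    "toy = np.array([[0.0, 0.0], [0.1, 0.2], [0.2, 0.1],\n",
    "                [5.0, 5.0], [5.1, 4.9], [4.9, 5.2]])\n",
    "toy_labels, toy_centroids = kmeans_sketch(toy, k=2)\n",
    "print(toy_labels)\n",
    "print(toy_centroids)\n",
    "```\n",
    "\n",
    "On blobs this far apart, the first three points should share one label and the last three the other. Which blob ends up as cluster 0 depends on the random initialization."
   ]
  },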
  {
   "cell_type": "markdown",
   "id": "d4e2c5a6",
   "metadata": {},
   "source": [
    "## KMeans Clustering"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ae5e8372",
   "metadata": {},
   "source": [
    "### Contents\n",
    "\n",
    "* [1 - Imports](#1)\n",
    "* [2 - Loading the data](#2)\n",
    "* [3 - Exploratory Analysis](#3)\n",
    "* [4 - Preprocessing the data](#4)\n",
    "* [5 - Trainin K-Means Clustering to Find Clusters](#5)\n",
    "* [6 - Evaluating K-Means Clustering](#6)\n",
    "* [7 - Final Notes](#7)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "99058d12",
   "metadata": {},
   "source": [
    "<a name='1'></a>\n",
    "## 1 - Imports"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "df278c5f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import seaborn as sns\n",
    "import sklearn\n",
    "import matplotlib.pyplot as plt\n",
    "%matplotlib inline"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "35699c38",
   "metadata": {},
   "source": [
    "<a name='2'></a>\n",
    "\n",
    "## 2 - Loading the data"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b0c6c4d7",
   "metadata": {},
   "source": [
    "In this notebook, we will use a different dataset. Up to this point creating these notebooks, my goal has been to look on the other side, to try something new, to try new a dataset. If you have went through some notebooks about other algorithms, no doubt that you have learned something new or perhaps nothing new but you experienced a new dataset. \n",
    "\n",
    "In this notebook, we will use a mushroom dataset. The dataset describes mushrooms in terms of their physical characteristics and they are classified into: poisonous or edible.\n",
    "\n",
    "The dataset also includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family. Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no simple rule for determining the edibility of a mushroom; no rule like `leaflets three, let it be for Poisonous Oak and Ivy.`\n",
    "\n",
    "The dataset contains the labels (edibility) but for the purpose of doing clustering, we will remove the labels. "
   ]
  },
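  {
   "cell_type": "markdown",
   "id": "7b2e4d90",
   "metadata": {},
   "source": [
    "Removing the labels amounts to dropping one column. Here is a minimal sketch on a made-up three-row frame. The names `toy_frame`, `toy_features`, and `toy_targets` are hypothetical, and the values are only there for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical stand-in for the mushroom data: two features and the label column\n",
    "toy_frame = pd.DataFrame({\n",
    "    'odor': ['p', 'a', 'n'],\n",
    "    'habitat': ['u', 'g', 'm'],\n",
    "    'class': ['p', 'e', 'e'],\n",
    "})\n",
    "\n",
    "# Keep the labels aside and cluster on the remaining columns only\n",
    "toy_targets = toy_frame['class']\n",
    "toy_features = toy_frame.drop(columns='class')\n",
    "print(toy_features.columns.tolist())\n",
    "```\n",
    "\n",
    "Keeping the labels in a separate variable, rather than discarding them, leaves the option of comparing the clusters against the true classes afterwards."
   ]
  },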
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "67905ee6",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Let's firs hide warnings just in case\n",
    "\n",
    "import warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "d3083946",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.datasets import fetch_openml\n",
    "\n",
    "mushroom_data = fetch_openml(name='mushroom', version=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "154ed67e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(8124, 22)"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mushroom_data.data.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b7a77ea3",
   "metadata": {},
   "source": [
    "As you can see above, there are 8124 examples and 22 features. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "69e438fb",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "**Author**: [Jeff Schlimmer](Jeffrey.Schlimmer@a.gp.cs.cmu.edu)  \n",
      "**Source**: [UCI](https://archive.ics.uci.edu/ml/datasets/mushroom) - 1981     \n",
      "**Please cite**:  The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf \n",
      "\n",
      "\n",
      "### Description\n",
      "\n",
      "This dataset describes mushrooms in terms of their physical characteristics. They are classified into: poisonous or edible.\n",
      "\n",
      "### Source\n",
      "```\n",
      "(a) Origin: \n",
      "Mushroom records are drawn from The Audubon Society Field Guide to North American Mushrooms (1981). G. H. Lincoff (Pres.), New York: Alfred A. Knopf \n",
      "\n",
      "(b) Donor: \n",
      "Jeff Schlimmer (Jeffrey.Schlimmer '@' a.gp.cs.cmu.edu)\n",
      "```\n",
      "\n",
      "### Dataset description\n",
      "\n",
      "This dataset includes descriptions of hypothetical samples corresponding to 23 species of gilled mushrooms in the Agaricus and Lepiota Family. Each species is identified as definitely edible, definitely poisonous, or of unknown edibility and not recommended. This latter class was combined with the poisonous one. The Guide clearly states that there is no simple rule for determining the edibility of a mushroom; no rule like ``leaflets three, let it be'' for Poisonous Oak and Ivy.\n",
      "\n",
      "### Attributes Information\n",
      "```\n",
      "1. cap-shape: bell=b,conical=c,convex=x,flat=f, knobbed=k,sunken=s \n",
      "2. cap-surface: fibrous=f,grooves=g,scaly=y,smooth=s \n",
      "3. cap-color: brown=n,buff=b,cinnamon=c,gray=g,green=r, pink=p,purple=u,red=e,white=w,yellow=y \n",
      "4. bruises?: bruises=t,no=f \n",
      "5. odor: almond=a,anise=l,creosote=c,fishy=y,foul=f, musty=m,none=n,pungent=p,spicy=s \n",
      "6. gill-attachment: attached=a,descending=d,free=f,notched=n \n",
      "7. gill-spacing: close=c,crowded=w,distant=d \n",
      "8. gill-size: broad=b,narrow=n \n",
      "9. gill-color: black=k,brown=n,buff=b,chocolate=h,gray=g, green=r,orange=o,pink=p,purple=u,red=e, white=w,yellow=y \n",
      "10. stalk-shape: enlarging=e,tapering=t \n",
      "11. stalk-root: bulbous=b,club=c,cup=u,equal=e, rhizomorphs=z,rooted=r,missing=? \n",
      "12. stalk-surface-above-ring: fibrous=f,scaly=y,silky=k,smooth=s \n",
      "13. stalk-surface-below-ring: fibrous=f,scaly=y,silky=k,smooth=s \n",
      "14. stalk-color-above-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y \n",
      "15. stalk-color-below-ring: brown=n,buff=b,cinnamon=c,gray=g,orange=o, pink=p,red=e,white=w,yellow=y \n",
      "16. veil-type: partial=p,universal=u \n",
      "17. veil-color: brown=n,orange=o,white=w,yellow=y \n",
      "18. ring-number: none=n,one=o,two=t \n",
      "19. ring-type: cobwebby=c,evanescent=e,flaring=f,large=l, none=n,pendant=p,sheathing=s,zone=z \n",
      "20. spore-print-color: black=k,brown=n,buff=b,chocolate=h,green=r, orange=o,purple=u,white=w,yellow=y \n",
      "21. population: abundant=a,clustered=c,numerous=n, scattered=s,several=v,solitary=y \n",
      "22. habitat: grasses=g,leaves=l,meadows=m,paths=p, urban=u,waste=w,woods=d\n",
      "```\n",
      "\n",
      "### Relevant papers\n",
      "\n",
      "Schlimmer,J.S. (1987). Concept Acquisition Through Representational Adjustment (Technical Report 87-19). Doctoral disseration, Department of Information and Computer Science, University of California, Irvine. \n",
      "\n",
      "Iba,W., Wogulis,J., & Langley,P. (1988). Trading off Simplicity and Coverage in Incremental Concept Learning. In Proceedings of the 5th International Conference on Machine Learning, 73-79. Ann Arbor, Michigan: Morgan Kaufmann. \n",
      "\n",
      "Duch W, Adamczak R, Grabczewski K (1996) Extraction of logical rules from training data using backpropagation networks, in: Proc. of the The 1st Online Workshop on Soft Computing, 19-30.Aug.1996, pp. 25-30, [Web Link] \n",
      "\n",
      "Duch W, Adamczak R, Grabczewski K, Ishikawa M, Ueda H, Extraction of crisp logical rules using constrained backpropagation networks - comparison of two new approaches, in: Proc. of the European Symposium on Artificial Neural Networks (ESANN'97), Bruge, Belgium 16-18.4.1997.\n",
      "\n",
      "Downloaded from openml.org.\n"
     ]
    }
   ],
   "source": [
    "# Description of the data \n",
    "print(mushroom_data.DESCR)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "d7180991",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['cap-shape',\n",
       " 'cap-surface',\n",
       " 'cap-color',\n",
       " 'bruises%3F',\n",
       " 'odor',\n",
       " 'gill-attachment',\n",
       " 'gill-spacing',\n",
       " 'gill-size',\n",
       " 'gill-color',\n",
       " 'stalk-shape',\n",
       " 'stalk-root',\n",
       " 'stalk-surface-above-ring',\n",
       " 'stalk-surface-below-ring',\n",
       " 'stalk-color-above-ring',\n",
       " 'stalk-color-below-ring',\n",
       " 'veil-type',\n",
       " 'veil-color',\n",
       " 'ring-number',\n",
       " 'ring-type',\n",
       " 'spore-print-color',\n",
       " 'population',\n",
       " 'habitat']"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Displaying feature names\n",
    "\n",
    "mushroom_data.feature_names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "eab24ee3",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "['class']"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Displaying target name\n",
    "\n",
    "mushroom_data.target_names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "e3ba092e",
   "metadata": {},
   "outputs": [],
   "source": [
    "# Getting the whole dataframe\n",
    "\n",
    "mushroom_data = mushroom_data.frame"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c492fd93",
   "metadata": {},
   "source": [
    "<a name='3'></a>\n",
    "## 3 - Exploratory Data Analysis\n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c895bd5e",
   "metadata": {},
   "source": [
    "### Taking a quick look into the dataset"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "f3eb141e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>cap-shape</th>\n",
       "      <th>cap-surface</th>\n",
       "      <th>cap-color</th>\n",
       "      <th>bruises%3F</th>\n",
       "      <th>odor</th>\n",
       "      <th>gill-attachment</th>\n",
       "      <th>gill-spacing</th>\n",
       "      <th>gill-size</th>\n",
       "      <th>gill-color</th>\n",
       "      <th>stalk-shape</th>\n",
       "      <th>...</th>\n",
       "      <th>stalk-color-above-ring</th>\n",
       "      <th>stalk-color-below-ring</th>\n",
       "      <th>veil-type</th>\n",
       "      <th>veil-color</th>\n",
       "      <th>ring-number</th>\n",
       "      <th>ring-type</th>\n",
       "      <th>spore-print-color</th>\n",
       "      <th>population</th>\n",
       "      <th>habitat</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>x</td>\n",
       "      <td>s</td>\n",
       "      <td>n</td>\n",
       "      <td>t</td>\n",
       "      <td>p</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>n</td>\n",
       "      <td>k</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>k</td>\n",
       "      <td>s</td>\n",
       "      <td>u</td>\n",
       "      <td>p</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>x</td>\n",
       "      <td>s</td>\n",
       "      <td>y</td>\n",
       "      <td>t</td>\n",
       "      <td>a</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>k</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>n</td>\n",
       "      <td>n</td>\n",
       "      <td>g</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>b</td>\n",
       "      <td>s</td>\n",
       "      <td>w</td>\n",
       "      <td>t</td>\n",
       "      <td>l</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>n</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>n</td>\n",
       "      <td>n</td>\n",
       "      <td>m</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>x</td>\n",
       "      <td>y</td>\n",
       "      <td>w</td>\n",
       "      <td>t</td>\n",
       "      <td>p</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>n</td>\n",
       "      <td>n</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>k</td>\n",
       "      <td>s</td>\n",
       "      <td>u</td>\n",
       "      <td>p</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>x</td>\n",
       "      <td>s</td>\n",
       "      <td>g</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>w</td>\n",
       "      <td>b</td>\n",
       "      <td>k</td>\n",
       "      <td>t</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>e</td>\n",
       "      <td>n</td>\n",
       "      <td>a</td>\n",
       "      <td>g</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 23 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "  cap-shape cap-surface cap-color bruises%3F odor gill-attachment  \\\n",
       "0         x           s         n          t    p               f   \n",
       "1         x           s         y          t    a               f   \n",
       "2         b           s         w          t    l               f   \n",
       "3         x           y         w          t    p               f   \n",
       "4         x           s         g          f    n               f   \n",
       "\n",
       "  gill-spacing gill-size gill-color stalk-shape  ... stalk-color-above-ring  \\\n",
       "0            c         n          k           e  ...                      w   \n",
       "1            c         b          k           e  ...                      w   \n",
       "2            c         b          n           e  ...                      w   \n",
       "3            c         n          n           e  ...                      w   \n",
       "4            w         b          k           t  ...                      w   \n",
       "\n",
       "  stalk-color-below-ring veil-type veil-color ring-number ring-type  \\\n",
       "0                      w         p          w           o         p   \n",
       "1                      w         p          w           o         p   \n",
       "2                      w         p          w           o         p   \n",
       "3                      w         p          w           o         p   \n",
       "4                      w         p          w           o         e   \n",
       "\n",
       "  spore-print-color population habitat class  \n",
       "0                 k          s       u     p  \n",
       "1                 n          n       g     e  \n",
       "2                 n          n       m     e  \n",
       "3                 k          s       u     p  \n",
       "4                 n          a       g     e  \n",
       "\n",
       "[5 rows x 23 columns]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mushroom_data.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "2d70e72a",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>cap-shape</th>\n",
       "      <th>cap-surface</th>\n",
       "      <th>cap-color</th>\n",
       "      <th>bruises%3F</th>\n",
       "      <th>odor</th>\n",
       "      <th>gill-attachment</th>\n",
       "      <th>gill-spacing</th>\n",
       "      <th>gill-size</th>\n",
       "      <th>gill-color</th>\n",
       "      <th>stalk-shape</th>\n",
       "      <th>...</th>\n",
       "      <th>stalk-color-above-ring</th>\n",
       "      <th>stalk-color-below-ring</th>\n",
       "      <th>veil-type</th>\n",
       "      <th>veil-color</th>\n",
       "      <th>ring-number</th>\n",
       "      <th>ring-type</th>\n",
       "      <th>spore-print-color</th>\n",
       "      <th>population</th>\n",
       "      <th>habitat</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>8119</th>\n",
       "      <td>k</td>\n",
       "      <td>s</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>a</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>y</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>b</td>\n",
       "      <td>c</td>\n",
       "      <td>l</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8120</th>\n",
       "      <td>x</td>\n",
       "      <td>s</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>a</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>y</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>n</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>b</td>\n",
       "      <td>v</td>\n",
       "      <td>l</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8121</th>\n",
       "      <td>f</td>\n",
       "      <td>s</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>a</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>n</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>b</td>\n",
       "      <td>c</td>\n",
       "      <td>l</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8122</th>\n",
       "      <td>k</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>y</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>n</td>\n",
       "      <td>b</td>\n",
       "      <td>t</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>e</td>\n",
       "      <td>w</td>\n",
       "      <td>v</td>\n",
       "      <td>l</td>\n",
       "      <td>p</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>8123</th>\n",
       "      <td>x</td>\n",
       "      <td>s</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>a</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>y</td>\n",
       "      <td>e</td>\n",
       "      <td>...</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>o</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>o</td>\n",
       "      <td>c</td>\n",
       "      <td>l</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 23 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "     cap-shape cap-surface cap-color bruises%3F odor gill-attachment  \\\n",
       "8119         k           s         n          f    n               a   \n",
       "8120         x           s         n          f    n               a   \n",
       "8121         f           s         n          f    n               a   \n",
       "8122         k           y         n          f    y               f   \n",
       "8123         x           s         n          f    n               a   \n",
       "\n",
       "     gill-spacing gill-size gill-color stalk-shape  ...  \\\n",
       "8119            c         b          y           e  ...   \n",
       "8120            c         b          y           e  ...   \n",
       "8121            c         b          n           e  ...   \n",
       "8122            c         n          b           t  ...   \n",
       "8123            c         b          y           e  ...   \n",
       "\n",
       "     stalk-color-above-ring stalk-color-below-ring veil-type veil-color  \\\n",
       "8119                      o                      o         p          o   \n",
       "8120                      o                      o         p          n   \n",
       "8121                      o                      o         p          o   \n",
       "8122                      w                      w         p          w   \n",
       "8123                      o                      o         p          o   \n",
       "\n",
       "     ring-number ring-type spore-print-color population habitat class  \n",
       "8119           o         p                 b          c       l     e  \n",
       "8120           o         p                 b          v       l     e  \n",
       "8121           o         p                 b          c       l     e  \n",
       "8122           o         e                 w          v       l     p  \n",
       "8123           o         p                 o          c       l     e  \n",
       "\n",
       "[5 rows x 23 columns]"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Displaying the last rows \n",
    "\n",
    "mushroom_data.tail()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "afc96380",
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'pandas.core.frame.DataFrame'>\n",
      "RangeIndex: 8124 entries, 0 to 8123\n",
      "Data columns (total 23 columns):\n",
      " #   Column                    Non-Null Count  Dtype   \n",
      "---  ------                    --------------  -----   \n",
      " 0   cap-shape                 8124 non-null   category\n",
      " 1   cap-surface               8124 non-null   category\n",
      " 2   cap-color                 8124 non-null   category\n",
      " 3   bruises%3F                8124 non-null   category\n",
      " 4   odor                      8124 non-null   category\n",
      " 5   gill-attachment           8124 non-null   category\n",
      " 6   gill-spacing              8124 non-null   category\n",
      " 7   gill-size                 8124 non-null   category\n",
      " 8   gill-color                8124 non-null   category\n",
      " 9   stalk-shape               8124 non-null   category\n",
      " 10  stalk-root                5644 non-null   category\n",
      " 11  stalk-surface-above-ring  8124 non-null   category\n",
      " 12  stalk-surface-below-ring  8124 non-null   category\n",
      " 13  stalk-color-above-ring    8124 non-null   category\n",
      " 14  stalk-color-below-ring    8124 non-null   category\n",
      " 15  veil-type                 8124 non-null   category\n",
      " 16  veil-color                8124 non-null   category\n",
      " 17  ring-number               8124 non-null   category\n",
      " 18  ring-type                 8124 non-null   category\n",
      " 19  spore-print-color         8124 non-null   category\n",
      " 20  population                8124 non-null   category\n",
      " 21  habitat                   8124 non-null   category\n",
      " 22  class                     8124 non-null   category\n",
      "dtypes: category(23)\n",
      "memory usage: 188.0 KB\n"
     ]
    }
   ],
   "source": [
    "mushroom_data.info()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1891be01",
   "metadata": {},
   "source": [
    "All features are categorical. So we will make sure to handle them. "
   ]
  },
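  {
   "cell_type": "markdown",
   "id": "f58a0c16",
   "metadata": {},
   "source": [
    "One common way to handle categorical features is one-hot encoding. The sketch below uses `pandas.get_dummies` on a made-up frame (`toy_cat` and its values are hypothetical). It is one option among several and not necessarily the encoding used later in this notebook.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical categorical frame with two features\n",
    "toy_cat = pd.DataFrame({\n",
    "    'odor': ['p', 'a', 'a', 'p'],\n",
    "    'gill-size': ['n', 'b', 'b', 'n'],\n",
    "})\n",
    "\n",
    "# One 0/1 column per (feature, category) pair\n",
    "toy_encoded = pd.get_dummies(toy_cat, dtype=int)\n",
    "print(toy_encoded)\n",
    "```\n",
    "\n",
    "Each original column expands into as many columns as it has categories, so a frame with many high-cardinality features can become much wider after encoding."
   ]
  },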
  {
   "cell_type": "markdown",
   "id": "4de710df",
   "metadata": {},
   "source": [
    "### Checking Summary Statistics"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "753bb539",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>cap-shape</th>\n",
       "      <th>cap-surface</th>\n",
       "      <th>cap-color</th>\n",
       "      <th>bruises%3F</th>\n",
       "      <th>odor</th>\n",
       "      <th>gill-attachment</th>\n",
       "      <th>gill-spacing</th>\n",
       "      <th>gill-size</th>\n",
       "      <th>gill-color</th>\n",
       "      <th>stalk-shape</th>\n",
       "      <th>...</th>\n",
       "      <th>stalk-color-above-ring</th>\n",
       "      <th>stalk-color-below-ring</th>\n",
       "      <th>veil-type</th>\n",
       "      <th>veil-color</th>\n",
       "      <th>ring-number</th>\n",
       "      <th>ring-type</th>\n",
       "      <th>spore-print-color</th>\n",
       "      <th>population</th>\n",
       "      <th>habitat</th>\n",
       "      <th>class</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>count</th>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>...</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "      <td>8124</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>unique</th>\n",
       "      <td>6</td>\n",
       "      <td>4</td>\n",
       "      <td>10</td>\n",
       "      <td>2</td>\n",
       "      <td>9</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>2</td>\n",
       "      <td>12</td>\n",
       "      <td>2</td>\n",
       "      <td>...</td>\n",
       "      <td>9</td>\n",
       "      <td>9</td>\n",
       "      <td>1</td>\n",
       "      <td>4</td>\n",
       "      <td>3</td>\n",
       "      <td>5</td>\n",
       "      <td>9</td>\n",
       "      <td>6</td>\n",
       "      <td>7</td>\n",
       "      <td>2</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>top</th>\n",
       "      <td>x</td>\n",
       "      <td>y</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>n</td>\n",
       "      <td>f</td>\n",
       "      <td>c</td>\n",
       "      <td>b</td>\n",
       "      <td>b</td>\n",
       "      <td>t</td>\n",
       "      <td>...</td>\n",
       "      <td>w</td>\n",
       "      <td>w</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>o</td>\n",
       "      <td>p</td>\n",
       "      <td>w</td>\n",
       "      <td>v</td>\n",
       "      <td>d</td>\n",
       "      <td>e</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>freq</th>\n",
       "      <td>3656</td>\n",
       "      <td>3244</td>\n",
       "      <td>2284</td>\n",
       "      <td>4748</td>\n",
       "      <td>3528</td>\n",
       "      <td>7914</td>\n",
       "      <td>6812</td>\n",
       "      <td>5612</td>\n",
       "      <td>1728</td>\n",
       "      <td>4608</td>\n",
       "      <td>...</td>\n",
       "      <td>4464</td>\n",
       "      <td>4384</td>\n",
       "      <td>8124</td>\n",
       "      <td>7924</td>\n",
       "      <td>7488</td>\n",
       "      <td>3968</td>\n",
       "      <td>2388</td>\n",
       "      <td>4040</td>\n",
       "      <td>3148</td>\n",
       "      <td>4208</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4 rows × 23 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "       cap-shape cap-surface cap-color bruises%3F  odor gill-attachment  \\\n",
       "count       8124        8124      8124       8124  8124            8124   \n",
       "unique         6           4        10          2     9               2   \n",
       "top            x           y         n          f     n               f   \n",
       "freq        3656        3244      2284       4748  3528            7914   \n",
       "\n",
       "       gill-spacing gill-size gill-color stalk-shape  ...  \\\n",
       "count          8124      8124       8124        8124  ...   \n",
       "unique            2         2         12           2  ...   \n",
       "top               c         b          b           t  ...   \n",
       "freq           6812      5612       1728        4608  ...   \n",
       "\n",
       "       stalk-color-above-ring stalk-color-below-ring veil-type veil-color  \\\n",
       "count                    8124                   8124      8124       8124   \n",
       "unique                      9                      9         1          4   \n",
       "top                         w                      w         p          w   \n",
       "freq                     4464                   4384      8124       7924   \n",
       "\n",
       "       ring-number ring-type spore-print-color population habitat class  \n",
       "count         8124      8124              8124       8124    8124  8124  \n",
       "unique           3         5                 9          6       7     2  \n",
       "top              o         p                 w          v       d     e  \n",
       "freq          7488      3968              2388       4040    3148  4208  \n",
       "\n",
       "[4 rows x 23 columns]"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Summary stats\n",
    "\n",
    "mushroom_data.describe()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "cac1f626",
   "metadata": {},
   "source": [
    "### Checking Missing Values"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "f0e4fb67",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "cap-shape                      0\n",
       "cap-surface                    0\n",
       "cap-color                      0\n",
       "bruises%3F                     0\n",
       "odor                           0\n",
       "gill-attachment                0\n",
       "gill-spacing                   0\n",
       "gill-size                      0\n",
       "gill-color                     0\n",
       "stalk-shape                    0\n",
       "stalk-root                  2480\n",
       "stalk-surface-above-ring       0\n",
       "stalk-surface-below-ring       0\n",
       "stalk-color-above-ring         0\n",
       "stalk-color-below-ring         0\n",
       "veil-type                      0\n",
       "veil-color                     0\n",
       "ring-number                    0\n",
       "ring-type                      0\n",
       "spore-print-color              0\n",
       "population                     0\n",
       "habitat                        0\n",
       "class                          0\n",
       "dtype: int64"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Checking missing values\n",
    "\n",
    "mushroom_data.isnull().sum()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "4977a1a1",
   "metadata": {},
   "source": [
    "It seems that we have missing values in the feature `stalk-root`. \n",
    "\n",
    "Usually, there are three ways to deal with missing values when they are present:\n",
    "* We can remove them completely,\n",
    "* We can leave them as they are, or\n",
    "* We can fill them using a given strategy such as the mean, the median, or the most frequent value. The mean and median only apply to numeric features, so for a categorical feature such as `stalk-root` the most frequent value is the usual choice. Both Scikit-Learn and Pandas provide quick ways to fill these kinds of values. \n",
    "\n",
    "We will handle that during the data preprocessing."
   ]
  },
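  {
   "cell_type": "markdown",
   "id": "5e2d7a90",
   "metadata": {},
   "source": [
    "As a rough illustration of the options listed above, here is a minimal sketch on a made-up toy DataFrame (the values are invented for illustration and are not taken from the mushroom data). It shows dropping rows that contain missing values and filling a categorical column with its most frequent value using Pandas.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Toy data with gaps in one categorical column (made-up values)\n",
    "toy = pd.DataFrame({\n",
    "    'stalk-root': ['b', np.nan, 'e', 'b', np.nan, 'c'],\n",
    "    'odor': ['n', 'f', 'n', 'a', 'n', 'f'],\n",
    "})\n",
    "\n",
    "# Removing: drop every row that has a missing value\n",
    "dropped = toy.dropna()\n",
    "\n",
    "# Filling: replace the gaps with the most frequent value of the column\n",
    "most_frequent = toy['stalk-root'].mode()[0]\n",
    "filled = toy.fillna({'stalk-root': most_frequent})\n",
    "\n",
    "print(len(toy), len(dropped))\n",
    "print(filled['stalk-root'].tolist())\n",
    "```\n",
    "\n",
    "Scikit-Learn's `SimpleImputer(strategy='most_frequent')` is one alternative for the filling step when the imputation should be part of a preprocessing pipeline."
   ]
  },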
  {
   "cell_type": "markdown",
   "id": "6bb0fe21",
   "metadata": {},
   "source": [
    "### More Data Exploration\n",
    "\n",
    "Before preprocessing the data, let's take a look at specific features. \n",
    "\n",
    "I also want to note that I do not know a lot about mushrooms. I thought that it would be interesting to use this real-world dataset, and perhaps some people who come across this notebook may know some of the mushroom samples and their characteristics. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "07aa6c1d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='cap-shape', ylabel='count'>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAgvElEQVR4nO3dfbSlZXkn6N8dwGAn0iJUDFJgoUMMqFCEEkw0xpFEkJUOCX4AJgIxHTRqG6cz9pjoKI2w0kZtx0TEwZZWFL/Q2DId04ZARpeOHxSx+A4NKEIxREpIUIMyiPf8cd4yR7oKzoHznF2n6rrW2uu8+36f9933qb1W8eOpZz+7ujsAAMDS+rFZNwAAANsjQRsAAAYQtAEAYABBGwAABhC0AQBggJ1n3cAoe+65Z69Zs2bWbQAAsB279NJLv9ndq7Z0brsN2mvWrMn69etn3QYAANuxqvr61s5ZOgIAAAMI2gAAMICgDQAAA2y3a7QBAFgZ7rnnnmzcuDHf+973Zt3KVu26665ZvXp1dtlllwVfMyxoV9U+Sc5N8ugkneTs7n57VT0qyUeSrElyY5IXdPc/VFUleXuSo5PcleTk7v7b6V4nJXnddOvTu/t9o/oGAGB5bdy4MY94xCOyZs2azEXCbUt35/bbb8/GjRuz3377Lfi6kUtHvp/kD7r7wCRPTfLyqjowyWuSXNTd+ye5aHqeJM9Jsv/0OCXJWUkyBfM3JDk8yWFJ3lBVuw/sGwCAZfS9730ve+yxxzYZspOkqrLHHnssesZ9WNDu7ls3z0h397eTXJNk7yTHJNk8I/2+JL8+HR+T5Nye88Ukj6yqvZIcmeTC7r6ju/8hyYVJjhrVNwAAy29bDdmbPZj+luXDkFW1JskhSb6U5NHdfet06u8zt7QkmQvhN8+7bONU21p9S69zSlWtr6r1mzZtWrpfAAAAFml40K6qn0zy8SSv6u5vzT/X3Z259dtLorvP7u513b1u1aotfkEPAADbiVNPPTVvectbZt3GVg0N2lW1S+ZC9nnd/edT+RvTkpBMP2+b6rck2Wfe5aun2tbqAACwzRoWtKddRN6T5Jru/o/zTl2Q5KTp+KQkn5xXP7HmPDXJndMSk08neXZV7T59CPLZUw0AgB3Iueeem4MOOigHH3xwXvSiF/3IuXe/+915ylOekoMPPjjPfe5zc9dddyVJzj///DzpSU/KwQcfnGc84xlJkquuuiqHHXZY1q5dm4MOOijXXXfdkH5Hzmg/LcmLkjyrqjZMj6OT/Ickv1JV1yX55el5knwqyVeTXJ/k3UleliTdfUeSNya5ZHqcNtUAANhBXHXVVTn99NNz8cUX57LLLsvb3/72Hzl/7LHH5pJLLslll12WAw44IO95z3uSJKeddlo+/elP57LLLssFF1yQJHnXu96V3//938+GDRuyfv36rF69ekjPw/bR7u7PJdnaxzOP2ML4TvLyrdzrnCTnLF13AACsJBdffHGe//znZ88990ySPOpRj/qR81deeWVe97rX5R//8R/zne98J0ceeWSS5GlPe1pOPvnkvOAFL8ixxx6bJPn5n//5nHHGGdm4cWOOPfbY7L///kN69hXsAACseCeffHLe8Y535Iorrsgb3vCGH+55/a53vSunn356br755hx66KG5/fbb88IXvjAXXHBBHv7wh+foo4/OxRdfPKQnQRsAgG3es571rJx//vm5/fbbkyR33PGjK4m//e1vZ6+99so999yT884774f1G264IYcffnhOO+20rFq1KjfffHO++tWv5nGPe1xe+cpX5phjjsnll18+pOdhS0cAAGCpPPGJT8xrX/va/NIv/VJ22mmnHHLIIVmzZs0Pz7/xjW/M4YcfnlWrVuXwww/Pt7/97STJq1/96lx33XXp7hxxxBE5+OCD86Y3vSnvf//7s8suu+Snf/qn80d/9EdDeq65pdHbn3Xr1vX69etn3QYAwJI69NXn
zrqFRbn0zSc+4JhrrrkmBxxwwDJ089Bsqc+qurS7121pvKUjAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAA9hHGwCAbcpSb2G4kC0GRzCjDQAAAwjaAADs8D7wgQ/ksMMOy9q1a/OSl7wk995770O+p6ANAMAO7ZprrslHPvKRfP7zn8+GDRuy00475bzzznvI97VGGwCAHdpFF12USy+9NE95ylOSJN/97nfzUz/1Uw/5voI2AAA7tO7OSSedlD/+4z9e0vtaOgIAwA7tiCOOyMc+9rHcdtttSZI77rgjX//61x/yfc1oAwCwTVnu7fgOPPDAnH766Xn2s5+dH/zgB9lll11y5pln5rGPfexDuq+gDQDADu+4447Lcccdt6T3tHQEAAAGELQBAGAAQRsAAAYQtAEAYABBGwAABhC0AQBgANv7AQCwTbnptCcv6f32ff0VS3q/hTKjDQAAAwjaAADs8G688cb87M/+bH7zN38zBxxwQJ73vOflrrvuekj3FLQBACDJtddem5e97GW55pprsttuu+Wd73znQ7qfoA0AAEn22WefPO1pT0uS/NZv/VY+97nPPaT7CdoAAJCkqu73+WIJ2gAAkOSmm27KF77whSTJBz/4wTz96U9/SPezvR8AANuUWW3H94QnPCFnnnlmXvziF+fAAw/M7/3e7z2k+wnaAACQZOedd84HPvCBJbufpSMAADCAoA0AwA5vzZo1ufLKK5f0noI2AAAz192zbuF+PZj+hgXtqjqnqm6rqivn1T5SVRumx41VtWGqr6mq784796551xxaVVdU1fVV9af1UPdZAQBgm7Lrrrvm9ttv32bDdnfn9ttvz6677rqo60Z+GPK9Sd6R5NzNhe4+bvNxVb01yZ3zxt/Q3Wu3cJ+zkvxuki8l+VSSo5L85dK3CwDALKxevTobN27Mpk2bZt3KVu26665ZvXr1oq4ZFrS7+7NVtWZL56ZZ6Rckedb93aOq9kqyW3d/cXp+bpJfj6ANALDd2GWXXbLffvvNuo0lN6s12r+Y5Bvdfd282n5V9ZWq+kxV/eJU2zvJxnljNk61LaqqU6pqfVWt35b/jwgAgO3frIL2CUk+NO/5rUn27e5DkvzbJB+sqt0We9PuPru713X3ulWrVi1RqwAAsHjL/oU1VbVzkmOTHLq51t13J7l7Or60qm5I8jNJbkkyfzHM6qkGAADbtFnMaP9ykr/r7h8uCamqVVW103T8uCT7J/lqd9+a5FtV9dRpXfeJST45g54BAGBRRm7v96EkX0jyhKraWFW/M506Pj+6bCRJnpHk8mm7v48leWl33zGde1mS/5Tk+iQ3xAchAQBYAUbuOnLCVuonb6H28SQf38r49UmetKTNAQDAYL4ZEgAABhC0AQBggGXfdQQAgB3HTac9edYtLNq+r79iSe5jRhsAAAYQtAEAYABBGwAABhC0AQBgAEEbAAAGELQBAGAAQRsAAAYQtAEAYABBGwAABhC0AQBgAEEbAAAGELQBAGAAQRsAAAYQtAEAYABBGwAABhC0AQBgAEEbAAAGELQBAGAAQRsAAAYQtAEAYICdZ90AsOM69NXnzrqFRbv0zSfOugUAVggz2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAPY3g9gEW467cmzbmFR9n39FbNuAWCHZUYbAAAGELQBAGAAQRsAAAYQtAEAYABBGwAABhC0AQBgAEEbAAAGGBa0q+qcqrqtqq6cVzu1qm6pqg3T4+h55/6wqq6vqmur6sh59aOm2vVV9ZpR/QIAwFIaOaP93iRHbaH+tu5eOz0+lSRVdWCS45M8cbrmnVW1U1XtlOTMJM9JcmCSE6axAACwTRv2zZDd/dmqWrPA4cck+XB3353ka1V1fZLDpnPXd/dXk6SqPjyNvXqp+wUAgKU0izXar6iqy6elJbtPtb2T3DxvzMaptrX6FlXVKVW1vqrWb9q0aan7BgCABVvuoH1W
kscnWZvk1iRvXcqbd/fZ3b2uu9etWrVqKW8NAACLMmzpyJZ09zc2H1fVu5P81+npLUn2mTd09VTL/dQBAGCbtawz2lW117ynv5Fk844kFyQ5vqp+vKr2S7J/ki8nuSTJ/lW1X1U9LHMfmLxgOXsGAIAHY9iMdlV9KMkzk+xZVRuTvCHJM6tqbZJOcmOSlyRJd19VVR/N3Iccv5/k5d1973SfVyT5dJKdkpzT3VeN6hkAAJbKyF1HTthC+T33M/6MJGdsof6pJJ9awtYAAGA43wwJAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADDAvaVXVOVd1WVVfOq725qv6uqi6vqk9U1SOn+pqq+m5VbZge75p3zaFVdUVVXV9Vf1pVNapnAABYKiNntN+b5Kj71C5M8qTuPijJf0/yh/PO3dDda6fHS+fVz0ryu0n2nx73vScAAGxzhgXt7v5skjvuU/ur7v7+9PSLSVbf3z2qaq8ku3X3F7u7k5yb5NcHtAsAAEtqlmu0X5zkL+c936+qvlJVn6mqX5xqeyfZOG/Mxqm2RVV1SlWtr6r1mzZtWvqOAQBggWYStKvqtUm+n+S8qXRrkn27+5Ak/zbJB6tqt8Xet7vP7u513b1u1apVS9cwAAAs0s7L/YJVdXKSX01yxLQcJN19d5K7p+NLq+qGJD+T5Jb86PKS1VMNAAC2acs6o11VRyX5d0l+rbvvmldfVVU7TcePy9yHHr/a3bcm+VZVPXXabeTEJJ9czp4BAODBGDajXVUfSvLMJHtW1cYkb8jcLiM/nuTCaZe+L047jDwjyWlVdU+SHyR5aXdv/iDlyzK3g8nDM7eme/66bgAA2CYNC9rdfcIWyu/ZytiPJ/n4Vs6tT/KkJWwNAACG882QAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADLChoV9VFC6kBAABzdr6/k1W1a5J/kWTPqto9SU2ndkuy9+DeAABgxbrfoJ3kJUleleQxSS7NPwftbyV5x7i2AABgZbvfoN3db0/y9qr6N939Z8vUEwAArHgPNKOdJOnuP6uqX0iyZv413X3uoL4AAGBFW1DQrqr3J3l8kg1J7p3KnUTQBgCALVhQ0E6yLsmB3d0jmwEAgO3FQvfRvjLJT49sBAAAticLndHeM8nVVfXlJHdvLnb3rw3pCgAAVriFBu1TRzYBAADbm4XuOvKZ0Y0AAMD2ZKG7jnw7c7uMJMnDkuyS5J+6e7dRjQEAwEq2oA9Ddvcjunu3KVg/PMlzk7zzga6rqnOq6raqunJe7VFVdWFVXTf93H2qV1X9aVVdX1WXV9XPzbvmpGn8dVV10qJ/SwAAWGYL3XXkh3rOf0ly5AKGvzfJUfepvSbJRd29f5KLpudJ8pwk+0+PU5KclcwF8yRvSHJ4ksOSvGFzOAcAgG3VQpeOHDvv6Y9lbl/t7z3Qdd392apac5/yMUmeOR2/L8n/neR/m+rnTnt1f7GqHllVe01jL+zuO6ZeLsxceP/QQnoHAIBZWOiuI/9q3vH3k9yYuWD8YDy6u2+djv8+yaOn472T3Dxv3MaptrX6/6CqTsncbHj23XffB9keAAA8dAvddeS3R7x4d3dVLdm3
TXb32UnOTpJ169b5FksAAGZmQWu0q2p1VX1i+mDjbVX18apa/SBf8xvTkpBMP2+b6rck2WfeuNVTbWt1AADYZi30w5D/OckFSR4zPf6vqfZgXJBk884hJyX55Lz6idPuI09Ncue0xOTTSZ5dVbtPH4J89lQDAIBt1kLXaK/q7vnB+r1V9aoHuqiqPpS5DzPuWVUbM7d7yH9I8tGq+p0kX0/ygmn4p5IcneT6JHcl+e0k6e47quqNSS6Zxp22+YORAACwrVpo0L69qn4r/7zTxwlJbn+gi7r7hK2cOmILYzvJy7dyn3OSnLOwVgEAYPYWunTkxZmbef77JLcmeV6Skwf1BAAAK95CZ7RPS3JSd/9D8sMvkXlL5gI4AABwHwud0T5oc8hO5tZNJzlkTEsAALDyLTRo/9j8rz2fZrQXOhsOAAA7nIWG5bcm+UJVnT89f36SM8a0BAAAK99Cvxny3Kpan+RZU+nY7r56XFsAALCyLXj5xxSshWsAAFiAha7RBgAAFkHQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAGWPWhX1ROqasO8x7eq6lVVdWpV3TKvfvS8a/6wqq6vqmur6sjl7hkAABZr5+V+we6+NsnaJKmqnZLckuQTSX47ydu6+y3zx1fVgUmOT/LEJI9J8tdV9TPdfe9y9g0AAIsx66UjRyS5obu/fj9jjkny4e6+u7u/luT6JIctS3cAAPAgzTpoH5/kQ/Oev6KqLq+qc6pq96m2d5Kb543ZONX+B1V1SlWtr6r1mzZtGtMxAAAswMyCdlU9LMmvJTl/Kp2V5PGZW1Zya5K3Lvae3X12d6/r7nWrVq1aqlYBAGDRZjmj/Zwkf9vd30iS7v5Gd9/b3T9I8u788/KQW5LsM++61VMNAAC2WbMM2idk3rKRqtpr3rnfSHLldHxBkuOr6serar8k+yf58rJ1CQAAD8Ky7zqSJFX1E0l+JclL5pX/pKrWJukkN24+191XVdVHk1yd5PtJXm7HEQAAtnUzCdrd/U9J9rhP7UX3M/6MJGeM7gsAAJbKrHcdAQCA7ZKgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAAD7DzrBgBgOdx02pNn3cKi7fv6K2bdAvAQmNEGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAFmFrSr6saquqKqNlTV+qn2qKq6sKqum37uPtWrqv60qq6vqsur6udm1TcAACzErGe0/+fuXtvd66bnr0lyUXfvn+Si6XmSPCfJ/tPjlCRnLXunAACwCLMO2vd1TJL3TcfvS/Lr8+rn9pwvJnlkVe01g/4AAGBBZhm0O8lfVdWlVXXKVHt0d986Hf99kkdPx3snuXnetRun2o+oqlOqan1Vrd+0adOovgEA4AHtPMPXfnp331JVP5Xkwqr6u/knu7urqhdzw+4+O8nZSbJu3bpFXQsAAEtpZjPa3X3L9PO2JJ9IcliSb2xeEjL9vG0afkuSfeZdvnqqAQDANmkmQbuqfqKqHrH5OMmzk1yZ5IIkJ03DTkryyen4giQnTruPPDXJnfOWmAAAwDZnVktHHp3kE1W1uYcPdvd/q6pLkny0qn4nydeTvGAa/6kkRye5PsldSX57+VsGAICFm0nQ
7u6vJjl4C/XbkxyxhXonefkytAYAAEtiW9veDwAAtguCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAyw86wb2BYc+upzZ93Colz65hNn3QIAAA/AjDYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAyx60q2qfqvqbqrq6qq6qqt+f6qdW1S1VtWF6HD3vmj+squur6tqqOnK5ewYAgMXaeQav+f0kf9Ddf1tVj0hyaVVdOJ17W3e/Zf7gqjowyfFJnpjkMUn+uqp+prvvXdauAQBgEZZ9Rru7b+3uv52Ov53kmiR7388lxyT5cHff3d1fS3J9ksPGdwoAAA/eTNdoV9WaJIck+dJUekVVXV5V51TV7lNt7yQ3z7tsY7YSzKvqlKpaX1XrN23aNKptAAB4QDML2lX1k0k+nuRV3f2tJGcleXyStUluTfLWxd6zu8/u7nXdvW7VqlVL2S4AACzKTIJ2Ve2SuZB9Xnf/eZJ09ze6+97u/kGSd+efl4fckmSfeZevnmoAALDNmsWuI5XkPUmu6e7/OK++17xhv5Hkyun4giTHV9WPV9V+SfZP8uXl6hcAAB6MWew68rQkL0pyRVVtmGp/lOSEqlqbpJPcmOQlSdLdV1XVR5NcnbkdS15uxxGA2Tv01efOuoVF+cQjZt0BsKNZ9qDd3Z9LUls49an7ueaMJGcMawoAAJaYb4YEAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhg51k3wOLddNqTZ93Cou37+itm3QIAwLIyow0AAAMI2gAAMIClIwCwgzn01efOuoVFu/TNJ866BVg0M9oAADCAoA0AAAOsmKBdVUdV1bVVdX1VvWbW/QAAwP1ZEUG7qnZKcmaS5yQ5MMkJVXXgbLsCAICtWykfhjwsyfXd/dUkqaoPJzkmydUz7QoeBPugA8COobp71j08oKp6XpKjuvtfT89flOTw7n7FfcadkuSU6ekTkly7rI0unz2TfHPWTfCgef9WNu/fyuW9W9m8fyvX9v7ePba7V23pxEqZ0V6Q7j47ydmz7mO0qlrf3etm3QcPjvdvZfP+rVzeu5XN+7dy7cjv3YpYo53kliT7zHu+eqoBAMA2aaUE7UuS7F9V+1XVw5Icn+SCGfcEAABbtSKWjnT396vqFUk+nWSnJOd091UzbmuWtvvlMds579/K5v1bubx3K5v3b+XaYd+7FfFhSAAAWGlWytIRAABYUQRtAAAYQNBeQapqTVVdOes+YEdVVa+sqmuq6rxZ98LC+bsTmJUV8WFIgG3Ey5L8cndvnHUjAGz7zGivPDtX1XnTrNrHqupfzLohFq6qTqyqy6vqsqp6/6z7YeGq6l1JHpfkL6vqf5l1Pzw4VfW4qvpKVT1l1r2wMFX1E1X1F9Pfm1dW1XGz7omFqaqnTP/N23V6H6+qqifNuq/lZNeRFaSq1iT5WpKnd/fnq+qcJFd391tm2xkLUVVPTPKJJL/Q3d+sqkd19x2z7ouFq6obk6zr7u35q4S3O9Pfnf81yXOTfDjJyd192UybYsGq6rlJjuru352e/8vuvnPGbbFAVXV6kl2TPDzJxu7+4xm3tKzMaK88N3f356fjDyR5+iybYVGeleT8zSFN
yIZltSrJJ5P8ppC94lyR5Feq6k1V9YtC9opzWpJfSbIuyZ/MuJdlJ2ivPPf9Jwj/JAHwwO5MclNMTqw43f3fk/xc5gL36VX1+hm3xOLskeQnkzwiczPbOxRBe+XZt6p+fjp+YZLPzbIZFuXiJM+vqj2SpKoeNeN+YEfy/yX5jSQnVtULZ90MC1dVj0lyV3d/IMmbMxe6WTn+zyT/e5Lzkrxpxr0sO7uOrDzXJnn55vXZSc6acT8sUHdfVVVnJPlMVd2b5CtJTp5tV7Dj6O5/qqpfTXJhVX2nuy+YdU8syJOTvLmqfpDkniS/N+N+WKCqOjHJPd39waraKcn/U1XP6u6LZ93bcvFhSAAAGMDSEQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0Abgh6rq5Kp6x6z7ANgeCNoAADCAoA2wQlTViVV1eVVdVlXvr6p/VVVfqqqvVNVfV9Wjp3GnTue/UFXXVdXvbuV+z6+qK6f7fXbeqcdU1X+brv2TeePPqqr1VXVVVf37efUbq+pPquqKqvpyVf1PU31VVX28qi6ZHk8b9EcDsE3yzZAAK0BVPTHJ65L8Qnd/s6oelaSTPLW7u6r+dZJ/l+QPpksOSvLUJD+R5CtV9Rfd/f/e57avT3Jkd99SVY+cV1+b5JAkdye5tqr+rLtvTvLa7r5j+oa3i6rqoO6+fLrmzu5+8vRNcP9Hkl9N8vYkb+vuz1XVvkk+neSApftTAdi2CdoAK8Ozkpzf3d9MkinwPjnJR6pqryQPS/K1eeM/2d3fTfLdqvqbJIcl+S/3uefnk7y3qj6a5M/n1S/q7juTpKquTvLYJDcneUFVnZK5/3bsleTAJJuD9ofm/XzbdPzLSQ6sqs333a2qfrK7v/Mg/wwAVhRLRwBWrj9L8o7ufnKSlyTZdd65vs/YrqozqmpDVW1Iku5+aeZmyfdJcmlV7TGNvXvedfcm2bmq9kvyvyY5orsPSvIX9/N6m49/LHMz7munx95CNrAjEbQBVoaLkzx/cxielo78yyS3TOdPus/4Y6pq12n8M5Nc0t2v3Rx6p3s8vru/1N2vT7Ipc4F7a3ZL8k9J7pzWgj/nPuePm/fzC9PxXyX5N5sHVNXaBf6uANsFS0cAVoDuvqqqzkjymaq6N8lXkpya5Pyq+ofMBfH95l1yeZK/SbJnkjduYX12kry5qvZPUkkuSnJZ5tZnb+n1L6uqryT5u8wtI/n8fYbsXlWXZ242/ISp9sokZ071nZN8NslLF/N7A6xk1X3ff10EYCWrqlOTfKe737JMr3djknWb148DMMfSEQAAGMCMNgAADGBGGwAABhC0AQBgAEEbAAAGELQBAGAAQRsAAAb4/wH9SJ0Uta2yFwAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "sns.countplot(data=mushroom_data, x='cap-shape', hue='class')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ba18317d",
   "metadata": {},
   "source": [
    "In `cap-shape`, the letters stand for: `bell=b,conical=c,convex=x,flat=f,knobbed=k,sunken=s`. It seems that the convex type is dominant, and most convex caps are edible. "
   ]
  },
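  {
   "cell_type": "markdown",
   "id": "c41b8f36",
   "metadata": {},
   "source": [
    "To read such plots without looking up the letter codes each time, one option is to map the codes to their names and tabulate the counts per class. The sketch below assumes the code list given above (with `b` standing for bell) and uses a few made-up rows in place of `mushroom_data`; the names `cap_shape_names` and `toy` are only for illustration.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "\n",
    "# Code-to-name mapping for the cap shape feature\n",
    "cap_shape_names = {'b': 'bell', 'c': 'conical', 'x': 'convex',\n",
    "                   'f': 'flat', 'k': 'knobbed', 's': 'sunken'}\n",
    "\n",
    "# Made-up toy rows standing in for mushroom_data\n",
    "toy = pd.DataFrame({\n",
    "    'cap-shape': ['x', 'x', 'f', 'b', 'x'],\n",
    "    'class': ['e', 'p', 'e', 'e', 'e'],\n",
    "})\n",
    "\n",
    "# Replace the single-letter codes with readable names\n",
    "toy['cap-shape-name'] = toy['cap-shape'].map(cap_shape_names)\n",
    "\n",
    "# Count edible (e) and poisonous (p) samples per cap shape\n",
    "counts = pd.crosstab(toy['cap-shape-name'], toy['class'])\n",
    "print(counts)\n",
    "```\n",
    "\n",
    "The same table computed on the real data would give the exact numbers behind the bars in the count plot."
   ]
  },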
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "2b2f8e6e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='cap-color', ylabel='count'>"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAe6klEQVR4nO3de7ReZX0n8O+vBA0oKpd4I2BwyrKCXJRwsXgbmGWRaYtVEbUKqB2s2qpji+NlBlwR27J06mhFHRQrUbxBy5h27FgGrLfxQkBuITKkKBBGJQa1eEGBPvPH2ZhDSOCE5Hn3Ocnns9a7zt7Pfvazf9lrcfLlyfPuXa21AAAAW9avjV0AAABsjQRtAADoQNAGAIAOBG0AAOhA0AYAgA7mjV1AD7vttltbtGjR2GUAALCVu+SSS37QWluwoWNbZdBetGhRli9fPnYZAABs5arq+o0ds3QEAAA6ELQBAKADQRsAADrYKtdoAwAwd9x+++1ZvXp1brvttrFL2aj58+dn4cKF2X777Wd8jqANAMCoVq9enZ122imLFi1KVY1dzj201rJ27dqsXr06e+2114zPs3QEAIBR3Xbbbdl1111nZchOkqrKrrvuuskz7oI2AACjm60h+y73pz5BGwAAOhC0AQCYk9761rfmne9859hlbJSgDQAAHQjaAADMCUuXLs3++++fAw44IC95yUvuduyDH/xgDj744BxwwAF57nOfm5/97GdJknPPPTdPeMITcsABB+RpT3takmTFihU55JBDcuCBB2b//ffPtdde26VeQRsAgFlvxYoVOe2003LRRRfl8ssvz7vf/e67HX/Oc56Tiy++OJdffnke//jH56yzzkqSLFmyJJ/73Ody+eWXZ9myZUmSD3zgA3nta1+byy67LMuXL8/ChQu71CxoAwAw61100UU59thjs9tuuyVJdtlll7sdv+qqq/LUpz41++23X84555ysWLEiSXL44YfnxBNPzAc/+MHceeedSZInP/nJ+bM/+7Ocfvrpuf7667PDDjt0qVnQBgBgzjvxxBPz3ve+N1deeWVOPfXUXz3z+gMf+EBOO+203HjjjTnooIOydu3avOhFL8qyZcuyww475Oijj85FF13UpSZBGwCAWe+II47Iueeem7Vr1yZJbrnllrsdv/XWW/OoRz0qt99+e84555xftf/zP/9zDj300CxZsiQLFizIjTfemOuuuy6Pfexj85rXvCbHHHNMrrjiii41ewU7AACz3r777pu3vOUtefrTn57tttsuT3ziE7No0aJfHX/b296WQw89NAsWLMihhx6aW2+9NUly8skn59prr01rLUceeWQOOOCAnH766fnoRz+a7bffPo985CPz5je/uUvN1VrrMvCYFi9e3JYvXz52GcAcd9DJSyd2rUvecfzErgUw26xcuTKPf/zjxy7jPm2ozqq6pLW2eEP9LR0BAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADowHO0AQCYVbb041XHeoSqGW0AAOhA0AYAYJv3sY99LIccckgOPPDAvOIVr8idd9652WMK2gAAbNNWrlyZT33qU/nKV76Syy67LNttt13OOeeczR7XGm0AALZpF154YS655JIcfPDBSZKf//znefjDH77Z4wraAABs01prOeGEE/Lnf/7nW3RcS0cAANimHXnkkTnvvPNy8803J0luueWWXH/99Zs9rhltAABmlUk/jm+fffbJaaedlmc+85n513/912y//fY544wz8pjHPGazxhW0AQDY5h133HE57rjjtuiY3ZaOVNWHq+rmqrpqWts7qupbVXVFVZ1fVQ+bduxNVbWqqq6pqt+a1n7U0Laqqt7Yq14AANiSeq7R/kiSo9ZruyDJE1pr+yf5v0nelCRVtU+SFyTZdzjnfVW1XVVtl+SMJM9Ksk+SFw59AQBgVusWtFtrX0xyy3pt/9hau2PY/VqShcP2MUk+2Vr7
RWvt20lWJTlk+KxqrV3XWvtlkk8OfQEAYFYb86kjL0vyD8P27klunHZs9dC2sfZ7qKqTqmp5VS1fs2ZNh3IBAGDmRgnaVfWWJHck2fxX7gxaa2e21ha31hYvWLBgSw0LAAD3y8SfOlJVJyb57SRHttba0HxTkj2mdVs4tOVe2gEAYNaaaNCuqqOSvCHJ01trP5t2aFmSj1fVXyZ5dJK9k3wjSSXZu6r2ylTAfkGSF02yZgAAJuuGJftt0fH2POXKLTreTHUL2lX1iSTPSLJbVa1OcmqmnjLywCQXVFWSfK219oettRVV9ekkV2dqScmrW2t3DuP8UZLPJdkuyYdbayt61QwAAFtKt6DdWnvhBprPupf+b0/y9g20fzbJZ7dgaQAAcDff+c53ctRRR+Wggw7KpZdemn333TdLly7NjjvueL/HHPOpIwAAMGtcc801edWrXpWVK1fmIQ95SN73vvdt1niCNgAAJNljjz1y+OGHJ0le/OIX58tf/vJmjSdoAwBAkuE7hBvd31SCNgAAJLnhhhvy1a9+NUny8Y9/PE95ylM2a7yJP0cbAADuzViP43vc4x6XM844Iy972cuyzz775JWvfOVmjSdoAwBAknnz5uVjH/vYFhvP0hEAAOhA0AYAYJu3aNGiXHXVVVt0TEEbAIDRtdbGLuFe3Z/6BG0AAEY1f/78rF27dtaG7dZa1q5dm/nz52/Seb4MCQDAqBYuXJjVq1dnzZo1Y5eyUfPnz8/ChQs36RxBGwCAUW2//fbZa6+9xi5ji7N0BAAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoIN5YxcAzC4Hnbx0Yte65B3HT+xaADBpZrQBAKADQRsAADqwdARgFrhhyX4Tu9aep1w5sWsBbMvMaAMAQAeCNgAAdNAtaFfVh6vq5qq6alrbLlV1QVVdO/zceWivqnpPVa2qqiuq6knTzjlh6H9tVZ3Qq14AANiSes5ofyTJUeu1vTHJha21vZNcOOwnybOS7D18Tkry/mQqmCc5NcmhSQ5Jcupd4RwAAGazbkG7tfbFJLes13xMkrOH7bOTPHta+9I25WtJHlZVj0ryW0kuaK3d0lr7YZILcs/wDgAAs86k12g/orX23WH7e0keMWzvnuTGaf1WD20ba7+HqjqpqpZX1fI1a9Zs2aoBAGATjfZlyNZaS9K24HhnttYWt9YWL1iwYEsNCwAA98ukg/b3hyUhGX7ePLTflGSPaf0WDm0bawcAgFlt0kF7WZK7nhxyQpLPTGs/fnj6yGFJfjwsMflckmdW1c7DlyCfObQBAMCs1u3NkFX1iSTPSLJbVa3O1NND/iLJp6vq5UmuT/L8oftnkxydZFWSnyV5aZK01m6pqrcluXjot6S1tv4XLAEAYNbpFrRbay/cyKEjN9C3JXn1Rsb5cJIPb8HSAACgO2+GBACADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6GDe2AXAtuaGJftN7Fp7nnLlxK4FANydGW0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoYJSgXVX/
sapWVNVVVfWJqppfVXtV1deralVVfaqqHjD0feCwv2o4vmiMmgEAYFPMm/QFq2r3JK9Jsk9r7edV9ekkL0hydJJ3tdY+WVUfSPLyJO8ffv6wtfbrVfWCJKcnOW7SdQNb3g1L9pvYtfY85cqJXQsAkvGWjsxLskNVzUuyY5LvJjkiyXnD8bOTPHvYPmbYz3D8yKqqyZUKAACbbuJBu7V2U5J3JrkhUwH7x0kuSfKj1todQ7fVSXYftndPcuNw7h1D/13XH7eqTqqq5VW1fM2aNX3/EAAAcB8mHrSraudMzVLvleTRSR6U5KjNHbe1dmZrbXFrbfGCBQs2dzgAANgsYywd+XdJvt1aW9Nauz3J3yY5PMnDhqUkSbIwyU3D9k1J9kiS4fhDk6ydbMkAALBpxgjaNyQ5rKp2HNZaH5nk6iSfT/K8oc8JST4zbC8b9jMcv6i11iZYLwAAbLIx1mh/PVNfarw0yZVDDWcm+U9JXl9VqzK1Bvus4ZSzkuw6tL8+yRsnXTMAAGyqiT/eL0laa6cmOXW95uuSHLKBvrclOXYSdQEAwJbizZAAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQwbyxCwAAmOsOOnnpxK51yTuOn9i12DxmtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKCDGQXtqrpwJm0AAMCUe328X1XNT7Jjkt2qauckNRx6SJLdO9cGAABz1n09R/sVSV6X5NFJLsm6oP0vSd7brywAAJjb7jVot9beneTdVfXHrbW/mlBNAAAw583ozZCttb+qqt9Msmj6Oa21yb0GCQAA5pAZBe2q+miSf5PksiR3Ds0tiaANAAAbMKOgnWRxkn1aa61nMQAAsLWY6XO0r0ryyJ6FAADA1mSmM9q7Jbm6qr6R5Bd3NbbWfrdLVQAAcB9uWLLfRK6z5ylX3q/zZhq033q/RgcAgG3UTJ868oXehQAAwNZkpk8duTVTTxlJkgck2T7JT1trD+lVGAAAzGUzndHe6a7tqqokxyQ5rFdRAAAw1830qSO/0qb8jyS/teXLAQCArcNMl448Z9rur2Xqudq3dakIAAC2AjN96sjvTNu+I8l3MrV8BAAA2ICZrtF+ae9CAABgazKjNdpVtbCqzq+qm4fP31TVwt7FAQDAXDXTL0P+dZJlSR49fP5uaAMAADZgpkF7QWvtr1trdwyfjyRZ0LEuAACY02YatNdW1Yurarvh8+Ika3sWBgAAc9lMg/bLkjw/yfeSfDfJ85Kc2KkmAACY82b6eL8lSU5orf0wSapqlyTvzFQAhznvoJOXTuxa5+90330AgLlvpjPa+98VspOktXZLkif2KQkAAOa+mQbtX6uqne/aGWa0ZzobDgAA25yZhuX/muSrVXXusH9skrf3KQkAAOa+Gc1ot9aWJnlOku8Pn+e01j56fy9aVQ+rqvOq6ltVtbKqnlxVu1TVBVV17fBz56FvVdV7qmpVVV1RVU+6v9cFAIBJmenSkbTWrm6tvXf4XL2Z1313kv/VWvuNJAckWZnkjUkubK3tneTCYT9JnpVk7+FzUpL3b+a1AQCguxkH7S2lqh6a5GlJzkqS1tovW2s/SnJMkrOHbmcnefawfUySpW3K15I8rKoeNdGiAQBgE008aCfZK8maJH9dVd+sqg9V1YOSPKK19t2hz/eSPGLY3j3JjdPOXz203U1VnVRVy6tq+Zo1azqWDwAA922MoD0vyZOSvL+19sQkP826ZSJJktZaS9I2ZdDW2pmttcWttcULFng7PAAA4xojaK9Osrq19vVh/7xMBe/v37UkZPh583D8piR7TDt/4dAGAACz1sSDdmvte0lurKrHDU1HJrk6ybIkJwxtJyT5zLC9LMnxw9NHDkvy42lLTAAAYFYa66Uzf5zknKp6QJLr
krw0U6H/01X18iTXJ3n+0PezSY5OsirJz4a+AAAwq40StFtrlyVZvIFDR26gb0vy6t41AQDAljTGGm0AANjqCdoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQgaANAAAdCNoAANCBoA0AAB0I2gAA0IGgDQAAHQjaAADQwWhBu6q2q6pvVtXfD/t7VdXXq2pVVX2qqh4wtD9w2F81HF80Vs0AADBTY85ovzbJymn7pyd5V2vt15P8MMnLh/aXJ/nh0P6uoR8AAMxqowTtqlqY5N8n+dCwX0mOSHLe0OXsJM8eto8Z9jMcP3LoDwAAs9a8ka7735K8IclOw/6uSX7UWrtj2F+dZPdhe/ckNyZJa+2Oqvrx0P8H0wesqpOSnJQke+65Z8/aAQDYiINOXjqxa52/0333GdPEZ7Sr6reT3Nxau2RLjttaO7O1tri1tnjBggVbcmgAANhkY8xoH57kd6vq6CTzkzwkybuTPKyq5g2z2guT3DT0vynJHklWV9W8JA9NsnbyZQMAwMxNfEa7tfam1trC1tqiJC9IclFr7feTfD7J84ZuJyT5zLC9bNjPcPyi1lqbYMkAALDJZtNztP9TktdX1apMrcE+a2g/K8muQ/vrk7xxpPoAAGDGxvoyZJKktfZPSf5p2L4uySEb6HNbkmMnWhgAAGym2TSjDQAAWw1BGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKCDeWMXAADAzN2wZL+JXGfPU66cyHW2Zma0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOhC0AQCgA0EbAAA6ELQBAKADQRsAADoQtAEAoANBGwAAOph40K6qParq81V1dVWtqKrXDu27VNUFVXXt8HPnob2q6j1VtaqqrqiqJ026ZgAA2FRjzGjfkeRPWmv7JDksyaurap8kb0xyYWtt7yQXDvtJ8qwkew+fk5K8f/IlAwDAppl40G6tfbe1dumwfWuSlUl2T3JMkrOHbmcnefawfUySpW3K15I8rKoeNdmqAQBg04y6RruqFiV5YpKvJ3lEa+27w6HvJXnEsL17khunnbZ6aFt/rJOqanlVLV+zZk2/ogEAYAZGC9pV9eAkf5Pkda21f5l+rLXWkrRNGa+1dmZrbXFrbfGCBQu2YKUAALDpRgnaVbV9pkL2Oa21vx2av3/XkpDh581D+01J9ph2+sKhDQAAZq0xnjpSSc5KsrK19pfTDi1LcsKwfUKSz0xrP354+shhSX48bYkJAADMSvNGuObhSV6S5Mqqumxoe3OSv0jy6ap6eZLrkzx/OPbZJEcnWZXkZ0leOtFqAQDgfph40G6tfTlJbeTwkRvo35K8umtRAACwhXkzJAAAdCBoAwBAB4I2AAB0IGgDAEAHgjYAAHQgaAMAQAeCNgAAdCBoAwBAB4I2AAB0IGgDAEAHgjYAAHQgaAMAQAeCNgAAdCBoAwBAB4I2AAB0IGgDAEAHgjYAAHQwb+wCtmY3LNlvYtfa85QrJ3YtAADumxltAADo
QNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADoQNAGAIAOBG0AAOjAmyEBmFW8VRfYWpjRBgCADgRtAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6EDQBgCADgRtAADowAtrALhPB528dGLXOn+niV0KoCsz2gAA0ME2N6NtVgYAgEkwow0AAB0I2gAA0IGgDQAAHQjaAADQwTb3ZUjWmeQXQy95x/ETuxYAwGxgRhsAADoQtAEAoANBGwAAOrBGGwA2ge+3rONewL0TtJmIG5bsN7Fr7XnKlRO7FgDAxsyZpSNVdVRVXVNVq6rqjWPXAwAA92ZOBO2q2i7JGUmelWSfJC+sqn3GrQoAADZuTgTtJIckWdVau6619sskn0xyzMg1AQDARlVrbewa7lNVPS/JUa21Pxj2X5Lk0NbaH03rc1KSk4bdxyW5ZuKF3tNuSX4wdhGzhHuxjnuxjnuxjnuxjnuxjnuxjnuxjnuxzmy4F49prS3Y0IGt5suQrbUzk5w5dh3TVdXy1triseuYDdyLddyLddyLddyLddyLddyLddyLddyLdWb7vZgrS0duSrLHtP2FQxsAAMxKcyVoX5xk76raq6oekOQFSZaNXBMAAGzUnFg60lq7o6r+KMnnkmyX5MOttRUjlzUTs2opy8jci3Xci3Xci3Xci3Xci3Xci3Xci3Xci3Vm9b2YE1+GBACAuWauLB0BAIA5RdAGAIAOBO0OqmpRVV01dh0AAIxH0AaAWa6m+Dsb5hj/0fYzr6rOqaqVVXVeVe04dkFjqarjq+qKqrq8qj46dj1jqaoXV9U3quqyqvrvVbXd2DWNpar+S1VdU1VfrqpPVNWfjl3TGIZ//VpZVR+sqhVV9Y9VtcPYdY1huBff8ntzneGeXFNVS5Nclbu/T2KbsP6/EFfVn1bVW0csaRRVdXJVvWbYfldVXTRsH1FV54xb3eRV1ZKqet20/bdX1WtHLGmjBO1+Hpfkfa21xyf5lySvGrmeUVTVvkn+c5IjWmsHJJmV/yH0VlWPT3JcksNbawcmuTPJ749a1Eiq6uAkz01yQJJnJZm1b/SakL2TnNFa2zfJjzJ1b7ZVfm/e096Zuif7ttauH7sYRvOlJE8dthcneXBVbT+0fXG0qsbz4STHJ8nwLz0vSPKxUSvaCEG7nxtba18Ztj+W5CljFjOiI5Kc21r7QZK01m4ZuZ6xHJnkoCQXV9Vlw/5jR61oPIcn+Uxr7bbW2q1J/m7sgkb27dbaZcP2JUkWjVfK6PzevKfrW2tfG7sIRndJkoOq6iFJfpHkq5kK3E/NVAjfprTWvpNkbVU9Mckzk3yztbZ23Ko2bE68sGaOWv8B5R5Yvm2rJGe31t40diHMOr+Ytn1nkm1y6cjA7817+unYBYzsjtx9UnD+WIWMqbV2e1V9O8mJSf5PkiuS/Nskv55k5YiljelDmbofj8zUDPesZEa7nz2r6snD9ouSfHnMYkZ0UZJjq2rXJKmqXUauZywXJnleVT08mboPVfWYkWsay1eS/E5Vza+qByf57bELYtbwe5P1fT/Jw6tq16p6YLbt3xdfSvKnmVoq8qUkf5ipmdxt9X9Iz09yVJKDM/Xm8FlJ0O7nmiSvrqqVSXZO8v6R6xlFa21Fkrcn+UJVXZ7kL0cuaRSttasztVb9H6vqiiQXJHnUuFWNo7V2cZJlmZqR+YckVyb58ahFMVv4vcndtNZuT7IkyTcy9XvzW+NWNKovZervja+21r6f5LZsg8tG7tJa+2WSzyf5dGvtzrHr2RivYAcmrqoe3Fr7yfBUiS8mOam1dunYdTGeqlqU5O9ba08YuxZg9hu+BHlpkmNba9eOXc/GmNEGxnDm8KXQS5P8jZANwExV1T5JViW5cDaH7MSMNgAAdGFGGwAAOhC0AQCgA0EbAAA6ELQB2GRV9ZOxawCY7QRtALqqKm8hBrZJgjbAHFRVx1fVFVV1eVV9tKp+p6q+XlXfrKr/XVWPGPq9
dTj+1aq6tqr+w0bGe0RVnT+Md3lV/ebQ/vqqumr4vG4D51VVvWM4fmVVHTe0P6OqvlRVy5Jc3e9OAMxeZhkA5piq2jdTbxr9zdbaD6pqlyQtyWGttVZVf5DkDUn+ZDhl/ySHJXlQkm9W1f9srf2/9YZ9T5IvtNZ+r6q2S/LgqjooyUuTHJqkkny9qr7QWvvmtPOek+TAJAck2S3JxVX1xeHYk5I8obX27S16AwDmCDPaAHPPEUnOba39IElaa7ckWZjkc1V1ZZKTk+w7rf9nWms/H/p/PskhGxnz/cN4d7bWfpzkKUnOb639tLX2kyR/m+Sp6533lCSfGM75fpIvJDl4OPYNIRvYlgnaAFuHv0ry3tbafklekWT+tGPrv5msVdXbq+qy4Q2dvfy049gAs56gDTD3XJTk2KraNUmGpSMPTXLTcPyE9fofU1Xzh/7PSHJxa+0trbUDW2sHDn0uTPLKYbztquqhSb6U5NlVtWNVPSjJ7w1t030pyXHDOQuSPC3JN7bgnxVgzrJGG2COaa2tqKq3J/lCVd2Z5JtJ3prk3Kr6YaaC+F7TTrkiU0tGdkvytg2sz06S1yY5s6penuTOJK9srX21qj6SdcH5Q+utz06S85M8OcnlmZo5f0Nr7XtV9Rtb4I8KMKdVa+v/iyIAW4uqemuSn7TW3jl2LQDbGktHAACgAzPaAADQgRltAADoQNAGAIAOBG0AAOhA0AYAgA4EbQAA6OD/A6GD8Ch0M9YbAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "\n",
    "sns.countplot(data=mushroom_data, x='cap-color', hue='class')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "884b9f13",
   "metadata": {},
   "source": [
    "The plot above shows the cap color. The letters stand for `brown=n,buff=b,cinnamon=c,gray=g,green=r,pink=p,purple=u,red=e,white=w,yellow=y`. \n",
    "\n",
    "It also seems that most caps are brown (n), whether edible or poisonous."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "b7c788ca",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='population', ylabel='count'>"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAdCklEQVR4nO3df9Cl5Vkf8O8VFhJ/JAHCioSlXUa3pkQNwZWgUZNCA4RWiZYYogZMaVc7oIm1VmJnSoIyE2sjGjXpEFmBaCUkMWaN1EjJD3+MBJZA+CmyBVLYIWEDhCQywUKu/vE+G4+4u7wL7/2efZfPZ+bMeZ7ruZ/nXGfOsPvdh/vcp7o7AADA0nrGvBsAAIC9kaANAAADCNoAADCAoA0AAAMI2gAAMMCqeTcwwkEHHdRr166ddxsAAOzlrr322s919+odHdsrg/batWuzefPmebcBAMBerqo+vbNjpo4AAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADLBq3g0AAMvrN3/2j+bdwl7vrLd9/7xbYA8w/I52Ve1TVddV1Yem/cOr6hNVtaWq3lNV+031Z077W6bja2eu8aapfltVnTC6ZwAAeKqWY+rIG5LcOrP/y0nO7+5vTvJgkjOm+hlJHpzq50/jUlVHJDk1yQuTnJjkHVW1zzL0DQAAT9rQoF1Va5L8qyS/Pe1XkmOTvG8acnGSV03bJ0/7mY4fN40/Ocml3f1Id9+ZZEuSo0f2DQAAT9XoO9q/luQ/J/nKtP+8JJ/v7ken/XuSHDptH5rk7iSZjj80jf9qfQfnfFVVbaiqzVW1edu2bUv8NgAAYPcMC9pV9a+T3Nfd1456jVndfUF3r+/u9atXr16OlwQAgJ0auerIS5P8QFWdlORZSZ6T5NeT7F9Vq6a71muSbJ3Gb01yWJJ7qmpVkucmuX+mvt3sOQAAsEcadke7u9/U3Wu6e20Wvsz4ke7+0SQfTXLKNOz0JB+ctjdN+5mOf6S7e6qfOq1KcniSdUmuHtU3AAAshXmso/3zSS6tql9Kcl2SC6f6hUneXVVbkjyQhXCe7r65qi5LckuSR5Oc2d2PLX/bAACweMsStLv7Y0k+Nm3fkR2sGtLdX07y6p2cf16S88Z1CAAAS8tPsAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwwLGhX1bOq6uqq+lRV3VxVb5nqF1XVnVV1/fQ4cqpXVb29qrZU1Q1VddTMtU6vqtunx+mjegYAgKWyauC1H0lybHd/qar2TfIXVfW/pmM/193ve9z4VyZZNz1ekuSdSV5SVQcmOSfJ+iSd5Nqq2tTdDw7sHQAAnpJhd7R7wZem3X2nR+/ilJOTXDKdd1WS/avqkCQnJLmiux+YwvUVSU4c1TcAACyFoXO0q2qfqro+yX1ZCMufmA6dN00POb+qnjnVDk1y98zp90y1ndUf/1obqmpzVW3etm3bUr8VAADYLUODdnc/1t1HJlmT5Oiq+tYkb0rygiTfmeTAJD+/RK91QXev7+71q1evXopLAgDAk7Ysq4509+eTfDTJid197zQ95JEkv5Pk6GnY1iSHzZy2ZqrtrA4AAHuskauOrK6q/aftr0nyiiR/Pc27TlVVklcluWk6ZVOS06bVR45J8lB335vkw0mOr6oDquqAJMdPNQAA2GONXHXk
kCQXV9U+WQj0l3X3h6rqI1W1OkkluT7JT07jL09yUpItSR5O8vok6e4HquoXk1wzjTu3ux8Y2DcAADxlw4J2d9+Q5MU7qB+7k/Gd5MydHNuYZOOSNggAAAP5ZUgAABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABhgXtqnpWVV1dVZ+qqpur6i1T/fCq+kRVbamq91TVflP9mdP+lun42plrvWmq31ZVJ4zqGQAAlsrIO9qPJDm2u1+U5MgkJ1bVMUl+Ocn53f3NSR5McsY0/owkD07186dxqaojkpya5IVJTkzyjqraZ2DfAADwlA0L2r3gS9PuvtOjkxyb5H1T/eIkr5q2T572Mx0/rqpqql/a3Y90951JtiQ5elTfAACwFIbO0a6qfarq+iT3Jbkiyf9J8vnufnQack+SQ6ftQ5PcnSTT8YeSPG+2voNzZl9rQ1VtrqrN27ZtG/BuAABg8YYG7e5+rLuPTLImC3ehXzDwtS7o7vXdvX716tWjXgYAABZlWVYd6e7PJ/loku9Ksn9VrZoOrUmyddremuSwJJmOPzfJ/bP1HZwDAAB7pJGrjqyuqv2n7a9J8ookt2YhcJ8yDTs9yQen7U3TfqbjH+nunuqnTquSHJ5kXZKrR/UNAABLYdUTD3nSDkly8bRCyDOSXNbdH6qqW5JcWlW/lOS6JBdO4y9M8u6q2pLkgSysNJLuvrmqLktyS5JHk5zZ3Y8N7BsAAJ6yYUG7u29I8uId1O/IDlYN6e4vJ3n1Tq51XpLzlrpHAAAYxS9DAgDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwADDgnZVHVZVH62qW6rq5qp6w1R/c1Vtrarrp8dJM+e8qaq2VNVtVXXCTP3Eqbalqs4e1TMAACyVVQOv/WiSn+3uT1bVs5NcW1VXTMfO7+7/Pju4qo5IcmqSFyZ5fpL/XVX/bDr8W0lekeSeJNdU1abuvmVg7wAA8JQMC9rdfW+Se6ftL1bVrUkO3cUpJye5tLsfSXJnVW1JcvR0bEt335EkVXXpNFbQBgBgj7Usc7Sram2SFyf5xFQ6q6puqKqNVXXAVDs0yd0zp90z1XZWBwCAPdbwoF1VX5/k/Une2N1fSPLOJN+U5Mgs3PF+2xK9zoaq2lxVm7dt27YUlwQAgCdtaNCuqn2zELJ/r7v/IEm6+7Pd/Vh3fyXJu/L300O2Jjls5vQ1U21n9X+guy/o7vXdvX716tVL/2YAAGA3jFx1pJJcmOTW7v7VmfohM8N+MMlN0/amJKdW1TOr6vAk65JcneSaJOuq6vCq2i8LX5jcNKpvAABYCiNXHXlpktclubGqrp9qv5DktVV1ZJJOcleSn0iS7r65qi7LwpccH01yZnc/liRVdVaSDyfZJ8nG7r55YN8AAPCUjVx15C+S1A4OXb6Lc85Lct4O6pfv6jwAANjT+GVIAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYA
gAEEbQAAGEDQBgCAARYVtKvqysXUAACABat2dbCqnpXka5McVFUHJKnp0HOSHDq4NwAAWLF2GbST/ESSNyZ5fpJr8/dB+wtJfnNcWwAAsLLtMmh3968n+fWq+qnu/o1l6gkAAFa8J7qjnSTp7t+oqu9Osnb2nO6+ZFBfAACwoi0qaFfVu5N8U5Lrkzw2lTuJoA0AADuwqKCdZH2SI7q7RzYDAAB7i8Wuo31Tkm8c2QgAAOxNFntH+6Akt1TV1Uke2V7s7h8Y0hUAAKxwiw3abx7ZBAAA7G0Wu+rIx0c3AgAAe5PFrjryxSysMpIk+yXZN8nfdvdzRjUGAAAr2WLvaD97+3ZVVZKTkxwzqikAAFjpFrvqyFf1gj9McsLStwMAAHuHxU4d+aGZ3WdkYV3tLw/pCAAA9gKLXXXk+2e2H01yVxamjwAAADuw2Dnarx/dCAAA7E0WNUe7qtZU1Qeq6r7p8f6qWjO6OQAAWKkW+2XI30myKcnzp8cfTbWdqqrDquqjVXVLVd1cVW+Y6gdW1RVVdfv0fMBUr6p6e1VtqaobquqomWudPo2/vapOfzJvFAAAltNig/bq7v6d7n50elyUZPUTnPNokp/t7iOysBTgmVV1RJKzk1zZ3euSXDntJ8krk6ybHhuSvDNZCOZJzknykiRHJzlnezgHAIA91WKD9v1V9WNVtc/0+LEk9+/qhO6+t7s/OW1/McmtSQ7NwpcoL56GXZzkVdP2yUkumZYPvCrJ/lV1SBaWEbyiux/o7geTXJHkxMW/RQAAWH6LDdr/NskPJ/lMknuTnJLkxxf7IlW1NsmLk3wiycHdfe906DNJDp62D01y98xp90y1ndUf/xobqmpzVW3etm3bYlsDAIAhFhu0z01yenev7u5vyELwfstiTqyqr0/y/iRv7O4vzB7r7s7f/7T7U9LdF3T3+u5ev3r1E81qAQCAsRYbtL99mraRJOnuB7Jwh3qXqmrfLITs3+vuP5jKn52mhGR6vm+qb01y2Mzpa6bazuoAALDHWmzQfsbsFxCnLyjucg3uqqokFya5tbt/debQpiTbVw45PckHZ+qnTauPHJPkoWmKyYeTHF9VB0w9HD/VAABgj7XYX4Z8W5K/qqr3TvuvTnLeE5zz0iSvS3JjVV0/1X4hyVuTXFZVZyT5dBbmfifJ5UlOSrIlycNJXp8s3D2vql9Mcs007tzpjjoAAOyxFvvLkJdU1eYkx06lH+ruW57gnL9IUjs5fNwOxneSM3dyrY1JNi6mVwAA2BMs9o52pmC9y3ANAAAsWOwcbQAAYDcI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAOsmncDAAAs3nk/dsq8W9jr/Zfffd+SXMcdbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGEDQBgCAAQRtAAAYYFjQrqqNVXVfVd00U3tzVW2tquunx0kzx95UVVuq6raqOmGmfuJU21JVZ4/qFwAAltLIO9oXJTlxB/Xzu/vI6XF5klTVEUlOTfLC6Zx3VNU+VbVPkt9K8sokRyR57TQWAAD2aKtGXbi7/6yq1i5y+MlJLu3uR5LcWVVbkhw9HdvS3XckSVVdOo29Zan7BQCApTSPOdpnVdUN09SSA6baoUnunhlzz1TbWf0fqaoNVbW5qjZv27ZtRN8AALBoyx2035nkm5IcmeTeJG9bqgt39wXdvb67169evXqpLgsAAE/KsKkjO9Ldn92+XVXvSvKhaXdrksNmhq6ZatlFHQAA9ljLeke7qg6Z2f3BJNtXJNmU5NSqemZVHZ5kXZKrk1yTZF1VHV5V+2XhC5OblrNnAAB4Mobd0a6q30/y8iQH
VdU9Sc5J8vKqOjJJJ7kryU8kSXffXFWXZeFLjo8mObO7H5uuc1aSDyfZJ8nG7r55VM8AALBURq468todlC/cxfjzkpy3g/rlSS5fwtYAAGA4vwwJAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAA6yadwMArDwf/76XzbuFp4WX/dnH590C8BS4ow0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADCAoA0AAAMI2gAAMICgDQAAAwjaAAAwgKANAAADCNoAADDAsKBdVRur6r6qummmdmBVXVFVt0/PB0z1qqq3V9WWqrqhqo6aOef0afztVXX6qH4BAGApjbyjfVGSEx9XOzvJld29LsmV036SvDLJuumxIck7k4VgnuScJC9JcnSSc7aHcwAA2JMNC9rd/WdJHnhc+eQkF0/bFyd51Uz9kl5wVZL9q+qQJCckuaK7H+juB5NckX8c3gEAYI+z3HO0D+7ue6ftzyQ5eNo+NMndM+PumWo7q/8jVbWhqjZX1eZt27YtbdcAALCb5vZlyO7uJL2E17ugu9d39/rVq1cv1WUBAOBJWe6g/dlpSkim5/um+tYkh82MWzPVdlYHAIA92nIH7U1Jtq8ccnqSD87UT5tWHzkmyUPTFJMPJzm+qg6YvgR5/FQDAIA92qpRF66q30/y8iQHVdU9WVg95K1JLquqM5J8OskPT8MvT3JSki1JHk7y+iTp7geq6heTXDONO7e7H/8FSwAA2OMMC9rd/dqdHDpuB2M7yZk7uc7GJBuXsDUAABjOL0MCAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMMJegXVV3VdWNVXV9VW2eagdW1RVVdfv0fMBUr6p6e1VtqaobquqoefQMAAC7Y553tP9Fdx/Z3eun/bOTXNnd65JcOe0nySuTrJseG5K8c9k7BQCA3bQnTR05OcnF0/bFSV41U7+kF1yVZP+qOmQO/QEAwKLNK2h3kj+tqmurasNUO7i77522P5Pk4Gn70CR3z5x7z1T7B6pqQ1VtrqrN27ZtG9U3AAAsyqo5ve73dPfWqvqGJFdU1V/PHuzurqrenQt29wVJLkiS9evX79a5AACw1OZyR7u7t07P9yX5QJKjk3x2+5SQ6fm+afjWJIfNnL5mqgEAwB5r2YN2VX1dVT17+3aS45PclGRTktOnYacn+eC0vSnJadPqI8ckeWhmigkAAOyR5jF15OAkH6iq7a//P7v7T6rqmiSXVdUZST6d5Ien8ZcnOSnJliQPJ3n98rcMAAC7Z9mDdnffkeRFO6jfn+S4HdQ7yZnL0BoAACyZPWl5PwAA2GsI2gAAMICgDQAAA8xrHW2AvPQ3XjrvFvZ6f/lTfznvFgCetgTtJN/xc5fMu4W93rW/ctq8WwAAWFamjgAAwACCNgAADCBoAwDAAOZos6L933O/bd4t7PX+yX+9cd4tAMCK5I42AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgDAMAAgjYAAAwgaAMAwACCNgAADCBoAwDAAII2AAAMIGgD
AMAAgjYAAAywYoJ2VZ1YVbdV1ZaqOnve/QAAwK6siKBdVfsk+a0kr0xyRJLXVtUR8+0KAAB2bkUE7SRHJ9nS3Xd0998luTTJyXPuCQAAdqq6e949PKGqOiXJid3976b91yV5SXefNTNmQ5IN0+63JLlt2RtdPgcl+dy8m+BJ8/mtXD67lc3nt7L5/Fauvf2z+6fdvXpHB1YtdyejdPcFSS6Ydx/Loao2d/f6effBk+PzW7l8diubz29l8/mtXE/nz26lTB3ZmuSwmf01Uw0AAPZIKyVoX5NkXVUdXlX7JTk1yaY59wQAADu1IqaOdPejVXVWkg8n2SfJxu6+ec5tzdPTYorMXsznt3L57FY2n9/K5vNbuZ62n92K+DIkAACsNCtl6ggAAKwogjYAAAwgaAMAwACCNgAADCBorzBV9YdVdW1V3Tz9GiYrSFWdVlU3VNWnqurd8+6HxamqtVV1a1W9a/pv70+r6mvm3ReLU1VfV1V/PP13d1NVvWbePbE4VfXWqjpzZv/NVfWf5tkTi1dV51bVG2f2z6uqN8yxpWVn1ZEVpqoO7O4Hpr/kr0nysu6+f9598cSq6oVJPpDku7v7c9s/y3n3xROrqrVJtiRZ393XV9VlSTZ19+/OtzMWo6r+TZITu/vfT/vP7e6H5twWi1BVL07ya939smn/liQndPfd8+2MxZj+7PyD7j6qqp6R5PYkRz+dcos72ivPT1fVp5JclYVfy1w3535YvGOTvLe7P5ckQvaKc2d3Xz9tX5tk7fxaYTfdmOQVVfXLVfW9QvbK0d3XJfmGqnp+Vb0oyYNC9srR3XcluX/6B9PxSa57OoXsZIX8YA0LqurlSf5lku/q7oer6mNJnjXPnuBp5JGZ7ceSmDqyQnT331TVUUlOSvJLVXVld587775YtPcmOSXJNyZ5z5x7Yff9dpIfz8Lnt3G+rSw/d7RXludm4V/zD1fVC5IcM++G2C0fSfLqqnpesjANaM79wNNCVT0/ycPTVJ9fSXLUnFti97wnyalZCNvvnXMv7L4PJDkxyXdm4Re+n1bc0V5Z/iTJT1bVrUluy8L0EVaI7r65qs5L8vGqeizJdVn4Vz4w1rcl+ZWq+kqS/5fkP8y5H3bD9Gfns5Ns7e57590Pu6e7/66qPprk89392Lz7WW6+DAkAwBDTlyA/meTV3X37vPtZbqaOAACw5KrqiCys2HTl0zFkJ+5oAwDAEO5oAwDAAII2AAAMIGgDAMAAgjYAqaq1VXXTIsb8yMz++qp6+/juAFYmQRuAxVqb5KtBu7s3d/dPz68dgD2boA2wAkx3k/+6qn6vqm6tqvdV1ddW1XFVdV1V3VhVG6vqmdP4u6rqv031q6vqm6f6RVV1ysx1v7ST1/rzqvrk9Pju6dBbk3xvVV1fVT9TVS+vqg9N5xxYVX9YVTdU1VVV9e1T/c1TXx+rqjuqSjAHnjYEbYCV41uSvKO7/3mSLyT5j0kuSvKa7v62LPza7+yvHj401X8zya/txuvcl+QV3X1Uktck2T495Owkf97dR3b3+Y875y1Jruvub0/yC0kumTn2giQnJDk6yTlVte9u9AKwYgnaACvH3d39l9P27yY5Lsmd3f03U+3iJN83M/73Z56/azdeZ98k76qqG5O8N8kRizjne5K8O0m6+yNJnldVz5mO/XF3P9Ldn8tCiD94N3oBWLFWzbsBABbt8b8w9vkkz1vk+O3bj2a6yTL9NPJ+OzjvZ5J8NsmLprFffhK9znpkZvux+LsHeJpwRxtg5fgnVbX9zvSPJNmcZO32+ddJXpfk4zPjXzPz/FfT9l1JvmPa/oEs3L1+vOcmube7vzJdc5+p/sUkz95Jb3+e5EeTpKpenuRz3f2FxbwpgL2VuwoAK8dtSc6sqo1Jbkny00muSvLeqlqV5Jok/2Nm/AFVdUMW7ii/dqq9K8kHq+pTSf4kyd/u4HXekeT9VXXa48bckOSx6dyLklw3c86bk2ycXu/hJKc/
tbcKsPJV9+P/TyQAe5qqWpvkQ939rYscf1eS9dO8aADmwNQRAAAYwB1tAAAYwB1tAAAYQNAGAIABBG0AABhA0AYAgAEEbQAAGOD/AxEusHM+HTldAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "\n",
    "sns.countplot(data=mushroom_data, x='population')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8988d4ce",
   "metadata": {},
   "source": [
    "The most populations are most several. Here are what the letters stand for: `abundant=a,clustered=c,numerous=n, scattered=s,several=v,solitary=y`.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "4820e0ed",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='habitat', ylabel='count'>"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAZY0lEQVR4nO3df7DldX3f8ddb1h9JNQHDhiKQLjWbGEwq6ookxsZoFKSxaGIcTFViaVenkMaOpuOPSTUaZtJWw8QfIYMjARIrUn/U1SEliI4oUWFR5KeELWhgB2EVf9aRBnz3j/tdc1x3l7tyP/fce/fxmLlzv+fz/Z5z33uYgSff/d7vqe4OAACwtB4w7wEAAGAtEtoAADCA0AYAgAGENgAADCC0AQBggHXzHmCEgw8+uDds2DDvMQAAWOOuvPLKL3f3+t3tW5OhvWHDhmzdunXeYwAAsMZV1Rf3tM+lIwAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABlg37wHm6fG/f968R1h1rvzvL5r3CAAAq4Iz2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADDAvtqnpIVV1eVZ+rquuq6g+n9SOr6tNVta2q3l1VD5rWHzw93jbt3zDzWq+a1m+squNGzQwAAEtl5Bntu5M8tbsfk+ToJMdX1bFJ/muSM7r7p5N8Nckp0/GnJPnqtH7GdFyq6qgkJyV5dJLjk/xZVR0wcG4AALjfhoV2L/jW9PCB01cneWqS90zr5yZ59rR94vQ40/6nVVVN6+d3993dfUuSbUmOGTU3AAAshaHXaFfVAVV1VZI7k1yc5P8k+Vp33zMdcluSw6btw5LcmiTT/q8n+YnZ9d08Z/Znba6qrVW1dceOHQP+NAAAsHhDQ7u77+3uo5McnoWz0I8a+LPO6u5N3b1p/fr1o34MAAAsyrLcdaS7v5bko0l+McmBVbVu2nV4ku3T9vYkRyTJtP/Hk3xldn03zwEAgBVp5F1H1lfVgdP2jyR5epIbshDcz50OOznJB6btLdPjTPs/0t09rZ803ZXkyCQbk1w+am4AAFgK6+77kB/aoUnOne4Q8oAkF3T3h6rq+iTnV9UfJflskndMx78jyV9W1bYkd2XhTiPp7uuq6oIk1ye5J8mp3X3vwLkBAOB+Gxba3X11ksfuZv3m7OauId39nSS/tYfXOj3J6Us9IwAAjOKTIQEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYIBhoV1VR1TVR6vq+qq6rqp+b1p/XVVtr6qrpq8TZp7zqqraVlU3VtVxM+vHT2vbquqVo2YGAIClsm7ga9+T5OXd/ZmqeliSK6vq4mnfGd39xtmDq+qoJCcleXSSRyT5cFX9zLT7bUmenuS2JFdU1Zbuvn7g7AAAcL8MC+3uvj3J7dP2N6vqhiSH7eUpJyY5v7vvTnJLVW1Lcsy0b1t335wkVXX+dKzQBgBgxVqWa7SrakOSxyb59LR0WlVdXVVnV9VB09phSW6dedpt09qe1nf9GZuramtVbd2xY8dS
/xEAAGCfDA/tqnpokvcmeVl3fyPJmUkemeToLJzxftNS/JzuPqu7N3X3pvXr1y/FSwIAwA9t5DXaqaoHZiGy39nd70uS7r5jZv/bk3xoerg9yREzTz98Wste1gEAYEUaedeRSvKOJDd095/MrB86c9hzklw7bW9JclJVPbiqjkyyMcnlSa5IsrGqjqyqB2XhFya3jJobAACWwsgz2k9K8sIk11TVVdPaq5M8v6qOTtJJvpDkJUnS3ddV1QVZ+CXHe5Kc2t33JklVnZbkoiQHJDm7u68bODcAANxvI+868okktZtdF+7lOacnOX036xfu7XkAALDS+GRIAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGWDfvAdh//f3rf2HeI6wqP/Vfrpn3CADAPnBGGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMMCy0q+qIqvpoVV1fVddV1e9N6w+vqour6qbp+0HTelXVm6tqW1VdXVWPm3mtk6fjb6qqk0fNDAAAS2XkGe17kry8u49KcmySU6vqqCSvTHJJd29Mcsn0OEmemWTj9LU5yZnJQpgneW2SJyY5Jslrd8Y5AACsVMNCu7tv7+7PTNvfTHJDksOSnJjk3Omwc5M8e9o+Mcl5veBTSQ6sqkOTHJfk4u6+q7u/muTiJMePmhsAAJbCslyjXVUbkjw2yaeTHNLdt0+7vpTkkGn7sCS3zjzttmltT+sAALBiDQ/tqnpokvcmeVl3f2N2X3d3kl6in7O5qrZW1dYdO3YsxUsCAMAPbWhoV9UDsxDZ7+zu903Ld0yXhGT6fue0vj3JETNPP3xa29P69+nus7p7U3dvWr9+/dL+QQAAYB+NvOtIJXlHkhu6+09mdm1JsvPOIScn+cDM+oumu48cm+Tr0yUmFyV5RlUdNP0S5DOmNQAAWLHWDXztJyV5YZJrquqqae3VSf44yQVVdUqSLyZ53rTvwiQnJNmW5NtJXpwk3X1XVb0hyRXTca/v7rsGzg0AAPfbsNDu7k8kqT3sftpuju8kp+7htc5OcvbSTQcAAGP5ZEgAABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADLCq0q+qSxawBAAAL1u1tZ1U9JMmPJjm4qg5KUtOuH0ty2ODZAABg1dpraCd5SZKXJXlEkivzj6H9jSRvHTcWAACsbnsN7e7+0yR/WlW/291vWaaZAABg1buvM9pJku5+S1X9UpINs8/p7vMGzQUAAKvaokK7qv4yySOTXJXk3mm5kwhtAADYjUWFdpJNSY7q7h45DAAArBWLvY/2tUn+6chBAABgLVnsGe2Dk1xfVZcnuXvnYnf/6yFTAQDAKrfY0H7dyCEAAGCtWexdRz42ehAAAFhLFnvXkW9m4S4jSfKgJA9M8n+7+8dGDQYAAKvZYs9oP2zndlVVkhOTHDtqKAAAWO0We9eR7+kF/yvJcUs/DgAArA2LvXTkN2YePiAL99X+zpCJAABgDVjsXUeeNbN9T5IvZOHyEQAAYDcWe432i0cPAgAAa8mirtGuqsOr6v1Vdef09d6qOnz0cAAAsFot9pch/yLJliSPmL4+OK0BAAC7sdjQXt/df9Hd90xf
5yRZP3AuAABY1RYb2l+pqhdU1QHT1wuSfGXkYAAAsJotNrT/bZLnJflSktuTPDfJ7wyaCQAAVr3F3t7v9UlO7u6vJklVPTzJG7MQ4AAAwC4We0b7X+yM7CTp7ruSPHbMSAAAsPotNrQfUFUH7XwwndFe7NlwAADY7yw2tN+U5JNV9YaqekOSv03y3/b2hKo6e7rn9rUza6+rqu1VddX0dcLMvldV1baqurGqjptZP35a21ZVr9y3Px4AAMzHYj8Z8ryq2prkqdPSb3T39ffxtHOSvDXJebusn9Hdb5xdqKqjkpyU5NFZuE/3h6vqZ6bdb0vy9CS3JbmiqrYs4mcDAMBcLfryjyluFx243X1pVW1Y5OEnJjm/u+9OcktVbUtyzLRvW3ffnCRVdf50rNAGAGBFW+ylI0vptKq6erq0ZOd134cluXXmmNumtT2t/4Cq2lxVW6tq644dO0bMDQAAi7bcoX1mkkcmOToL9+N+01K9cHef1d2bunvT+vU+tBIAgPla1juHdPcdO7er6u1JPjQ93J7kiJlDD5/Wspd1AABYsZb1jHZVHTrz8DlJdt6RZEuSk6rqwVV1ZJKNSS5PckWSjVV1ZFU9KAu/MLllOWcGAIAfxrAz2lX1riRPSXJwVd2W5LVJnlJVRyfpJF9I8pIk6e7rquqCLPyS4z1JTu3ue6fXOS3JRUkOSHJ2d183amYAAFgqw0K7u5+/m+V37OX405Ocvpv1C5NcuISjAQDAcPO46wgAAKx5QhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADDAunkPAADL6a0v/+C8R1hVTnvTs+Y9AqxazmgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADDAstKvq7Kq6s6qunVl7eFVdXFU3Td8Pmtarqt5cVduq6uqqetzMc06ejr+pqk4eNS8AACylkWe0z0ly/C5rr0xySXdvTHLJ9DhJnplk4/S1OcmZyUKYJ3ltkicmOSbJa3fGOQAArGTDQru7L01y1y7LJyY5d9o+N8mzZ9bP6wWfSnJgVR2a5LgkF3f3Xd391SQX5wfjHQAAVpzlvkb7kO6+fdr+UpJDpu3Dktw6c9xt09qe1n9AVW2uqq1VtXXHjh1LOzUAAOyjuf0yZHd3kl7C1zuruzd196b169cv1csCAMAPZblD+47pkpBM3++c1rcnOWLmuMOntT2tAwDAirbcob0lyc47h5yc5AMz6y+a7j5ybJKvT5eYXJTkGVV10PRLkM+Y1gAAYEVbN+qFq+pdSZ6S5OCqui0Ldw/54yQXVNUpSb6Y5HnT4RcmOSHJtiTfTvLiJOnuu6rqDUmumI57fXfv+guWAACw4gwL7e5+/h52PW03x3aSU/fwOmcnOXsJRwMAgOF8MiQAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAOvmPQCw/J70lifNe4RV57LfvWzeIwCwyjijDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAM
ILQBAGAAoQ0AAAMIbQAAGGAuoV1VX6iqa6rqqqraOq09vKourqqbpu8HTetVVW+uqm1VdXVVPW4eMwMAwL6Y5xntX+3uo7t70/T4lUku6e6NSS6ZHifJM5NsnL42Jzlz2ScFAIB9tJIuHTkxybnT9rlJnj2zfl4v+FSSA6vq0DnMBwAAizav0O4kf1NVV1bV5mntkO6+fdr+UpJDpu3Dktw689zbprXvU1Wbq2prVW3dsWPHqLkBAGBR1s3p5/5yd2+vqp9McnFVfX52Z3d3VfW+vGB3n5XkrCTZtGnTPj0XAACW2lzOaHf39un7nUnen+SYJHfsvCRk+n7ndPj2JEfMPP3waQ0AAFasZQ/tqvonVfWwndtJnpHk2iRbkpw8HXZykg9M21uSvGi6+8ixSb4+c4kJAACsSPO4dOSQJO+vqp0//3909/+uqiuSXFBVpyT5YpLnTcdfmOSEJNuSfDvJi5d/ZAAA2DfLHtrdfXOSx+xm/StJnrab9U5y6jKMBgAAS2Yl3d4PAADWDKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYIB5fAQ7ALAfOv0Fz533CKvOa/7qPfMegfvBGW0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAA/jAGoBl9rF/+SvzHmHV+ZVLPzbvEQD2mTPaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMsG7eAwAAMN4Np39k3iOsKj/3mqfe79dwRhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwwKoJ7ao6vqpurKptVfXKec8DAAB7sypCu6oOSPK2JM9MclSS51fVUfOdCgAA9mxVhHaSY5Js6+6bu/v/JTk/yYlzngkAAPaounveM9ynqnpukuO7+99Nj1+Y5IndfdrMMZuTbJ4e/mySG5d90KVzcJIvz3uI/Zj3f768//PjvZ8v7/98ef/nZ7W/9/+su9fvbse65Z5klO4+K8lZ855jKVTV1u7eNO859lfe//ny/s+P936+vP/z5f2fn7X83q+WS0e2Jzli5vHh0xoAAKxIqyW0r0iysaqOrKoHJTkpyZY5zwQAAHu0Ki4d6e57quq0JBclOSDJ2d193ZzHGmlNXAKzinn/58v7Pz/e+/ny/s+X939+1ux7vyp+GRIAAFab1XLpCAAArCpCGwAABhDaK1hVva6qXjHvOWC5VNW35j0DACwVoQ0AAAMI7RWmql5TVX9XVZ/Iwidcskyq6g+q6saq+kRVvcvfJrDWVdWGqvp8VZ0z/XvnnVX1a1V1WVXdVFXHzHvG/cHMP4d3VtUNVfWeqvrRec+1P5je+2tnHr+iql43x5H2G1X1+1X1H6ftM6rqI9P2U6vqnfOdbukI7RWkqh6fhXuEH53khCRPmOtA+5GqekKS30zymCTPTLImP6EKduOnk7wpyaOmr99O8stJXpHk1XOca3/zs0n+rLt/Lsk3kvyHOc8Do308yZOn7U1JHlpVD5zWLp3bVEtMaK8sT07y/u7+dnd/Iz6UZzk9KckHuvs73f3NJB+c90CwTG7p7mu6+7tJrktySS/c9/WaJBvmOtn+5dbuvmza/qss/M8OrGVXJnl8Vf1YkruTfDILwf3kLET4mrAqPrAGgGHuntn+7szj78Z/I5bTrh9q4UMulsc9+f6Tjg+Z1yD7m+7+h6q6JcnvJPnbJFcn+dUs/C3bDXMcbUk5o72yXJrk2VX1I1X1sCTPmvdA+5HLkjyrqh5SVQ9N8uvzHgjYr/xUVf3itP3bST4xz2H2I3ck+cmq+omqenD8u3+5fTwLl6ldOm2/NMlnew19mqLQXkG6+zNJ
3p3kc0n+OskV851o/9HdV2ThUp2rs/DeX5Pk63MdCtif3Jjk1Kq6IclBSc6c8zz7he7+hySvT3J5kouTfH6+E+13Pp7k0CSf7O47knwna+iykcRHsMP3VNVDu/tb02/7X5pk8/Q/PwDDVNWGJB/q7p+f9yzA0nL9Hfyjs6rqqCxco3euyAYA7g9ntAEAYADXaAMAwABCGwAABhDaAAAwgNAGWAOqakNVXbsPx59TVc/dzfqmqnrztP2UqvqlRbzWoo4D2N+46wgA39PdW5NsnR4+Jcm3svCpbXuz2OMA9ivOaAOsHQdU1dur6rqq+pvpU2b/fVVdUVWfq6r3TveJ3+nXqmprVf1dVf168r2z0x+a7u380iT/qaquqqonV9WzqurTVfXZqvpwVR2yu+OW/U8NsEIJbYC1Y2OSt3X3o5N8LclvJnlfdz+hux+T5IYkp8wcvyHJMUn+VZI/r6qH7NzR3V9I8udJzujuo7v741n4WPBju/uxSc5P8p/3cBwAcekIwFpyS3dfNW1fmYWQ/vmq+qMkByZ5aJKLZo6/oLu/m+Smqro5yaPu4/UPT/Luqjo0yYOS3LJ0owOsPc5oA6wdd89s35uFkynnJDmtu38hyR9m4ZNPd9r1E8vu6xPM3pLkrdNrvWSX1wJgF0IbYG17WJLbq+qBSf7NLvt+q6oeUFWPTPLPk9y4y/5vTs/f6ceTbJ+2T97LcQBEaAOsdX+Q5NNJLkvy+V32/X2Sy5P8dZKXdvd3dtn/wSTPmfklx9cl+Z9VdWWSL+/lOACSVPd9/U0hAACwr5zRBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAG+P+KhjME3gYWpAAAAABJRU5ErkJggg==\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "\n",
    "sns.countplot(data=mushroom_data, x='habitat')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "id": "740b3356",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='stalk-root', ylabel='count'>"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAa1UlEQVR4nO3dfdCld13f8c+XLE8qmMSsacgGk9qtGhQWXAOIDxBKnvoQtEhDBRaks1oTR6ZqgeoYBDOj9YECldhYIolSY0QpK6TiNkQcGCHZ6BLyQMo2QJM1kJXEAKKpid/+cV+xN2F3c3b3/t3nPruv18yZPed3rnOd73KG5T0X17lOdXcAAICV9Yh5DwAAAIcjoQ0AAAMIbQAAGEBoAwDAAEIbAAAGWDfvAUY47rjj+uSTT573GAAAHOauv/76v+ju9Xt77rAM7ZNPPjk7duyY9xgAABzmqupT+3rOqSMAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAZYN+8B1oJv/YnL5z3CYe/6X3jpvEcAAFhVjmgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwwL7ap6TFVdW1UfqaqbqupnpvW3VdUnqmrndNs0rVdVvamqdlXVDVX1tGX72lJVH59uW0bNDAAAK2XdwH3fl+T07v5CVT0yyQeq6n9Mz/1Ed7/jIdufnWTjdHt6kouTPL2qjk1yYZLNSTrJ9VW1rbvvGTg7AAAckmFHtHvJF6aHj5xuvZ+XnJvk8ul1H0pydFWdkOTMJNu7++4prrcnOWvU3AAAsBKGnqNdVUdV1c4kd2Uplj88PXXRdHrIG6rq0dPaiUluX/byO6a1fa0/9L22VtWOqtqxZ8+elf6rAADAARka2t39QHdvSrIhyWlV9c1JXpPkG5N8W5Jjk7xqhd7rku7e3N2b169fvxK7BACAg7YqVx3p7r9Mck2Ss7r7zun0kPuS/HqS06bNdic5adnLNkxr+1oHAIA1a+RVR9ZX1dHT/ccmeV6Sj03nXaeqKsnzk9w4vWRbkpdOVx95RpJ7u/vOJO9NckZVHVNVxyQ5Y1oDAIA1a+RVR05IcllVHZWloL+yu99dVe+rqvVJKsnOJD80bX9VknOS7EryxSQvT5LuvruqXp/kumm713X33QPnBgCAQzYstLv7hiRP3cv66fvYvpOcv4/nLk1y6YoOCAAAA/llSAAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYIBhoV1Vj6mqa6vqI1V1U1X9zLR+SlV9uKp2VdVvV9WjpvVHT493Tc+fvGxfr5nWb62qM0fNDAAAK2XkEe37kpze3U9JsinJWVX1jCQ/n+QN3f2PktyT5BXT9q9Ics+0/oZpu1TVqUnOS/KkJGcleUtVHTVwbgAAOGTDQruXfGF6+Mjp1klOT/KOaf2yJM+f7p87Pc70/HOrqqb1K7r7vu7+RJJdSU4bNTcAAKyEoedoV9VRVbUzyV1Jtif530n+srvvnza5I8mJ0/0Tk9yeJNPz9yb5muXre3nN8vfaWlU7
qmrHnj17BvxtAABgdkNDu7sf6O5NSTZk6Sj0Nw58r0u6e3N3b16/fv2otwEAgJmsylVHuvsvk1yT5JlJjq6qddNTG5Lsnu7vTnJSkkzPf3WSzy5f38trAABgTRp51ZH1VXX0dP+xSZ6X5JYsBfcLps22JHnXdH/b9DjT8+/r7p7Wz5uuSnJKko1Jrh01NwAArIR1D7/JQTshyWXTFUIekeTK7n53Vd2c5Iqq+tkkf5bkrdP2b03yG1W1K8ndWbrSSLr7pqq6MsnNSe5Pcn53PzBwbgAAOGTDQru7b0jy1L2s35a9XDWku/8myfftY18XJblopWcEAIBR/DIkAAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwwLDQrqqTquqaqrq5qm6qqh+d1l9bVburaud0O2fZa15TVbuq6taqOnPZ+lnT2q6qevWomQEAYKWsG7jv+5P8WHf/aVU9Lsn1VbV9eu4N3f2LyzeuqlOTnJfkSUmekOR/VtU/np7+lSTPS3JHkuuqalt33zxwdgAAOCTDQru770xy53T/81V1S5IT9/OSc5Nc0d33JflEVe1Kctr03K7uvi1JquqKaVuhDQDAmrUq52hX1clJnprkw9PSBVV1Q1VdWlXHTGsnJrl92cvumNb2tf7Q99haVTuqaseePXtW+q8AAAAHZHhoV9VXJfndJK/s7s8luTjJ1yfZlKUj3r+0Eu/T3Zd09+bu3rx+/fqV2CUAABy0kedop6oemaXIfnt3/16SdPdnlj3/a0nePT3cneSkZS/fMK1lP+sAALAmjbzqSCV5a5JbuvuXl62fsGyz70ly43R/W5LzqurRVXVKko1Jrk1yXZKNVXVKVT0qS1+Y3DZqbgAAWAkjj2g/K8lLkny0qnZOa/8hyYuqalOSTvLJJD+YJN19U1VdmaUvOd6f5PzufiBJquqCJO9NclSSS7v7poFzAwDAIRt51ZEPJKm9PHXVfl5zUZKL9rJ+1f5eBwAAa41fhgQAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABZgrtqrp6ljUAAGDJuv09WVWPSfIVSY6rqmOS1PTU45OcOHg2AABYWPsN7SQ/mOSVSZ6Q5Pr8/9D+XJL/PG4sAABYbPsN7e5+Y5I3VtWPdPebV2kmAABYeA93RDtJ0t1vrqpvT3Ly8td09+WD5gIAgIU2U2hX1W8k+fokO5M8MC13EqENAAB7MVNoJ9mc5NTu7ll3XFUnZSnEj89SlF/S3W+sqmOT/HaWjo5/MskLu/ueqqokb0xyTpIvJnlZd//ptK8tSX5q2vXPdvdls84BAADzMOt1tG9M8g8OcN/3J/mx7j41yTOSnF9VpyZ5dZKru3tjkqunx0lydpKN021rkouTZArzC5M8PclpSS6croACAABr1qxHtI9LcnNVXZvkvgcXu/tf7OsF3X1nkjun+5+vqluydEnAc5M8e9rssiR/lORV0/rl01HzD1XV0VV1wrTt9u6+O0mqanuSs5L81oyzAwDAqps1tF97KG9SVScneWqSDyc5forwJPl0lk4tSZYi/PZlL7tj
WtvX+kPfY2uWjoTniU984qGMCwAAh2zWq468/2DfoKq+KsnvJnlld39u6VTsv99vV9XM533vT3dfkuSSJNm8efOK7BMAAA7WrD/B/vmq+tx0+5uqeqCqPjfD6x6Zpch+e3f/3rT8memUkEx/3jWt705y0rKXb5jW9rUOAABr1kyh3d2P6+7Hd/fjkzw2yb9M8pb9vWa6ishbk9zS3b+87KltSbZM97ckedey9ZfWkmckuXc6xeS9Sc6oqmOmL0GeMa0BAMCaNetVR/5eL/nvSc58mE2fleQlSU6vqp3T7ZwkP5fkeVX18ST/ZHqcJFcluS3JriS/luSHp/e7O8nrk1w33V734BcjAQBgrZr1B2u+d9nDR2Tputp/s7/XdPcHktQ+nn7uXrbvJOfvY1+XJrl0llkBAGAtmPWqI/982f37s/RDM+eu+DQAAHCYmPWqIy8fPQgAABxOZr3qyIaqemdV3TXdfreqNoweDgAAFtWsX4b89SxdFeQJ0+33pzUAAGAvZg3t9d396919/3R7W5L1A+cCAICFNmtof7aqXlxVR023Fyf57MjBAABgkc0a2j+Q5IVJPp3kziQvSPKyQTMBAMDCm/Xyfq9LsqW770mSqjo2yS9mKcABAICHmPWI9pMfjOzk73+t8aljRgIAgMU3a2g/oqqOefDBdER71qPhAABwxJk1ln8pyZ9U1e9Mj78vyUVjRgIAgMU36y9DXl5VO5KcPi19b3ffPG4sAABYbDOf/jGFtbgGAIAZzHqONgAAcACENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADDAstKvq0qq6q6puXLb22qraXVU7p9s5y557TVXtqqpbq+rMZetnTWu7qurVo+YFAICVNPKI9tuSnLWX9Td096bpdlWSVNWpSc5L8qTpNW+pqqOq6qgkv5Lk7CSnJnnRtC0AAKxp60btuLv/uKpOnnHzc5Nc0d33JflEVe1Kctr03K7uvi1JquqKadubV3peAABYSfM4R/uCqrphOrXkmGntxCS3L9vmjmltX+tfpqq2VtWOqtqxZ8+eEXMDAMDMVju0L07y9Uk2JbkzyS+t1I67+5Lu3tzdm9evX79SuwUAgIMy7NSRvenuzzx4v6p+Lcm7p4e7k5y0bNMN01r2sw4AAGvWqh7RrqoTlj38niQPXpFkW5LzqurRVXVKko1Jrk1yXZKNVXVKVT0qS1+Y3LaaMwMAwMEYdkS7qn4rybOTHFdVdyS5MMmzq2pTkk7yySQ/mCTdfVNVXZmlLznen+T87n5g2s8FSd6b5Kgkl3b3TaNmBgCAlTLyqiMv2svyW/ez/UVJLtrL+lVJrlrB0QAAYDi/DAkAAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAHWzXsA4Mj1rDc/a94jHPY++CMfnPcIAEcsR7QBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADDAstKvq0qq6q6puXLZ2bFVtr6qPT38eM61XVb2pqnZV1Q1V9bRlr9kybf/xqtoyal4AAFhJI49ovy3JWQ9Ze3WSq7t7Y5Krp8dJcnaSjdNta5KLk6UwT3JhkqcnOS3JhQ/GOQAArGXDQru7/zjJ3Q9ZPjfJZdP9y5I8f9n65b3kQ0mOrqoTkpyZZHt3393d9yTZni+PdwAA
WHNW+xzt47v7zun+p5McP90/Mcnty7a7Y1rb1/qXqaqtVbWjqnbs2bNnZacGAIADNLcvQ3Z3J+kV3N8l3b25uzevX79+pXYLAAAHZbVD+zPTKSGZ/rxrWt+d5KRl222Y1va1DgAAa9pqh/a2JA9eOWRLknctW3/pdPWRZyS5dzrF5L1JzqiqY6YvQZ4xrQEAwJq2btSOq+q3kjw7yXFVdUeWrh7yc0murKpXJPlUkhdOm1+V5Jwku5J8McnLk6S7766q1ye5btrudd390C9YAgDAmjMstLv7Rft46rl72baTnL+P/Vya5NIVHA0AAIbzy5AAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYIB18x4ADsX/ed23zHuEw94Tf/qj8x4BABaSI9oAADCA0AYAgAHmEtpV9cmq+mhV7ayqHdPasVW1vao+Pv15zLReVfWmqtpVVTdU1dPmMTMAAByIeR7Rfk53b+ruzdPjVye5urs3Jrl6epwkZyfZON22Jrl41ScFAIADtJZOHTk3yWXT/cuSPH/Z+uW95ENJjq6qE+YwHwAAzGxeod1J/rCqrq+qrdPa8d1953T/00mOn+6fmOT2Za+9Y1r7ElW1tap2VNWOPXv2jJobAABmMq/L+31Hd++uqq9Nsr2qPrb8ye7uquoD2WF3X5LkkiTZvHnzAb0WAABW2lyOaHf37unPu5K8M8lpST7z4Ckh0593TZvvTnLSspdvmNYAAGDNWvXQrqqvrKrHPXg/yRlJbkyyLcmWabMtSd413d+W5KXT1UeekeTeZaeYAADAmjSPU0eOT/LOqnrw/f9bd/9BVV2X5MqqekWSTyV54bT9VUnOSbIryReTvHz1RwYAgAOz6qHd3bclecpe1j+b5Ll7We8k56/CaAAAsGLW0uX9AADgsCG0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADDAunkPAMDief93ffe8RzgifPcfv3/eIwCHwBFtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBggIUJ7ao6q6purapdVfXqec8DAAD7sxChXVVHJfmVJGcnOTXJi6rq1PlOBQAA+7YQoZ3ktCS7uvu27v6/Sa5Icu6cZwIAgH2q7p73DA+rql6Q5Kzu/jfT45ckeXp3X7Bsm61Jtk4PvyHJras+6Oo5LslfzHsIDprPb3H57Babz2+x+fwW1+H+2X1dd6/f2xPrVnuSUbr7kiSXzHuO1VBVO7p787zn4OD4/BaXz26x+fwWm89vcR3Jn92inDqyO8lJyx5vmNYAAGBNWpTQvi7Jxqo6paoeleS8JNvmPBMAAOzTQpw60t33V9UFSd6b5Kgkl3b3TXMea56OiFNkDmM+v8Xls1tsPr/F5vNbXEfsZ7cQX4YEAIBFsyinjgAAwEIR2gAAMIDQXiBVdXJV3TjvOQAAeHhCG4AjSi3xv3/AcP6hWTzrqurtVXVLVb2jqr5i3gMxu6p6aVXdUFUfqarfmPc8zK6qXlxV11bVzqr6L1V11LxnYnbT/yN4a1VdnuTGfOlvM7BGPfT/ya2qH6+q185xJA5AVf3Q9G/mzqr6RFVdM++ZVpvQXjzf
kOQt3f1NST6X5IfnPA8zqqonJfmpJKd391OS/OicR2JGVfVNSf5Vkmd196YkDyT5/rkOxcHYmKV/P5/U3Z+a9zBwuOvuX53+zfy2JHck+eX5TrT6hPbiub27Pzjd/80k3zHPYTggpyf5ne7+iyTp7rvnPA+ze26Sb01yXVXtnB7/w7lOxMH4VHd/aN5DwBHojUne192/P+9BVttC/GANX+KhFz53IXQYr5Jc1t2vmfcgHJK/mvcAHLD786UHBR8zr0E4OFX1siRfl+SCOY8yF45oL54nVtUzp/v/OskH5jkMB+R9Sb6vqr4mSarq2DnPw+yuTvKCqvraZOmzq6qvm/NMcCT4TJKvraqvqapHJ/ln8x6I2VXVtyb58SQv7u6/m/c88yC0F8+tSc6vqluSHJPk4jnPw4y6+6YkFyV5f1V9JEfguWqLqrtvztL59X9YVTck2Z7khPlOBYe/7v7bJK9Lcm2W/nv3sflOxAG6IMmxSa6ZvhD5X+c90GrzE+wAADCAI9oAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG2ABVdVr6yqr5hhuz+qqs3T/S+swlwvq6onjH4fgLVKaAMsvlcmedjQPhRVdTC/JPyyJEIbOGIJbYAFUlVfWVXvqaqPVNWNVXVhlmL2mqq6Ztrm4qraUVU3VdXPPMz+jquqP6mqf7qX595WVb9aVR9O8h+ralNVfaiqbqiqd1bVMdN2X7ZeVS9IsjnJ26cfqnjsiv+HAbDGCW2AxXJWkj/v7qd09zcn+U9J/jzJc7r7OdM2P9ndm5M8Ocl3V9WT97ajqjo+yXuS/HR3v2cf77chybd3979LcnmSV3X3k5N8NMmF0zZftt7d70iyI8n3d/em7v7rQ/trAyweoQ2wWD6a5HlV9fNV9Z3dfe9etnlhVf1pkj9L8qQkp+5lm0cmuTrJv+/u7ft5v9/p7geq6quTHN3d75/WL0vyXftaP4i/F8BhR2gDLJDu/l9Jnpal4P7Zqvrp5c9X1SlJfjzJc6cjzO9J8pi97Or+JNcnOXPZay+aTvPYuWy7v1rZvwHAkUNoAyyQ6SoeX+zu30zyC1mK7s8nedy0yeOzFMf3TqeGnL2PXXWSH0jyjVX1qiTp7p+cTvPY9GUbLx05v6eqvnNaekmS9+9rfbq/fC6AI87BfIscgPn5liS/UFV/l+Rvk/zbJM9M8gdV9efd/Zyq+rMkH0tye5IP7mtH0ykhL0qyrao+391veZj33pLkV6dLCd6W5OUPs/62af2vkzzTedrAkaa6e94zAADAYcepIwAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAP8PopXbX1vK0UoAAAAASUVORK5CYII=\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "\n",
    "sns.countplot(data=mushroom_data, x='stalk-root')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9c778994",
   "metadata": {},
   "source": [
    "Above is the feature that is missing values. We will remove all missing values. Since the missing values are of one category, we will drop it to avoid adding noise in the dataset. "
   ]
  },
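  {
   "cell_type": "markdown",
   "id": "5d2e7a10",
   "metadata": {},
   "source": [
    "As a minimal sketch of that clean-up step, the cell below drops rows with a missing `stalk-root` from a small made-up frame. The `toy` DataFrame and the `?` placeholder are assumptions for illustration only; how missing entries are actually encoded depends on how the data was loaded, so check that before reusing this on `mushroom_data`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "c41b9e6f",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "\n",
    "# Hypothetical toy frame standing in for the real data (assumption).\n",
    "toy = pd.DataFrame({\n",
    "    'class': ['e', 'p', 'e', 'p'],\n",
    "    'stalk-root': ['b', np.nan, 'c', '?'],\n",
    "})\n",
    "\n",
    "# Treat a '?' placeholder as missing, then drop rows lacking stalk-root.\n",
    "toy_clean = toy.replace('?', np.nan).dropna(subset=['stalk-root'])\n",
    "\n",
    "# Compare row counts before and after dropping.\n",
    "print(toy.shape, toy_clean.shape)"
   ]
  },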
  {
   "cell_type": "markdown",
   "id": "bed27ebd",
   "metadata": {},
   "source": [
    "And finally, we can look in the class feature. There are two categories, `e(edible)` and `p(poisonous)`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "id": "3c59437e",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "<AxesSubplot:xlabel='class', ylabel='count'>"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    },
    {
     "data": {
      "image/png": "iVBORw0KGgoAAAANSUhEUgAAAtoAAAGpCAYAAACzsJHBAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjQuMiwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy8rg+JYAAAACXBIWXMAAAsTAAALEwEAmpwYAAAaeUlEQVR4nO3df7BndX3f8dfbXdCkMQHkhpLdjUvMtgnkB5otkpg/LI6AtBWTUYs1cWuZbDLFNplJUyWTBqOhkzQaGm1kSgoK1oRQjXXr0JotmqSmEVjqivwI4y1q2R2EVVBjTWih7/5xz5pvcHe9C/dzv/euj8fMd/acz/mc7/dz/9l5zpnzPd/q7gAAACvrKfNeAAAAHIuENgAADCC0AQBgAKENAAADCG0AABhg47wXMMLJJ5/cW7dunfcyAAA4xt12222f7e6FQx07JkN769at2bNnz7yXAQDAMa6qPn24Y24dAQCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwwMZ5L+BY9gM/d928lwCsE7f92qvmvQQAVpgr2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhgeGhX1Yaq+mhVvX/aP62qbq6qxar63ao6fhp/6rS/OB3fOvMel07j91TVeaPXDAAAT9ZqXNH+6SR3z+z/apIruvs7kzyc5OJp/OIkD0/jV0zzUlWnJ7koyRlJzk/ytqrasArrBgCAJ2xoaFfV5iR/J8m/m/YryTlJ3j1NuTbJS6btC6f9TMdfMM2/MMn13f1Id38yyWKSs0auGwAAnqzRV7T/dZJ/nuT/TfvPSPL57n502t+XZNO0vSnJfUkyHf/CNP8r44c45yuqamdV7amqPQcOHFjhPwMAAI7OsNCuqr+b5MHuvm3UZ8zq7qu6e3t3b19YWFiNjwQAgMPaOPC9n5fkxVV1QZKnJfnmJL+R5ISq2jhdtd6cZP80f3+SLUn2VdXGJN+S5HMz4wfNngMAAGvSsCva3X1pd2/u7q1Z+jLjB7v7lUk+lOSl07QdSd43be+a9jMd/2B39zR+0fRUktOSbEtyy6h1AwDAShh5RftwXpvk+qr65SQfTXL1NH51kndW1WKSh7IU5+nuO6vqhiR3JXk0ySXd/djqLxsAAJZvVUK7u/8gyR9M2/fmEE8N6e6/SPKyw5x/eZLLx60QAABWll+GBACAAYQ2AAAMMI97tAHgsP7XG7533ksA1olv/8WPz3sJR+SKNgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwwLDQrqqnVdUtVfWxqrqzqn5pGn9HVX2yqvZOrzOn8aqqt1TVYlXdXlXPmXmvHVX1iem1Y9SaAQBgpWwc+N6PJDmnu79UVccl+XBV/efp2M9197sfN/9FSbZNr+cmuTLJc6vqpCSXJdmepJPcVlW7uvvhgWsHAIAnZdgV7V7ypWn3uOnVRzjlwiTXTed9JMkJVXVqkvOS7O7uh6a43p3k/FHrBgCAlTD0Hu2q2lBVe5M8mKVYvnk6dPl0e8gVVfXUaWxTkvtmTt83jR1u/PGftbOq9lTVngMHDqz0nwIAAEdlaGh392PdfWaSzUnOqqrvSXJpku9K8reSnJTktSv0WVd19/bu3r6wsLASbwkAAE/Yqjx1pLs/n+RDSc7v7vun20Me
SfL2JGdN0/Yn2TJz2uZp7HDjAACwZo186shCVZ0wbX9Dkhcm+dPpvutUVSV5SZI7plN2JXnV9PSRs5N8obvvT/KBJOdW1YlVdWKSc6cxAABYs0Y+deTUJNdW1YYsBf0N3f3+qvpgVS0kqSR7k/zUNP/GJBckWUzy5SSvTpLufqiq3pjk1mneG7r7oYHrBgCAJ21YaHf37UmefYjxcw4zv5Nccphj1yS5ZkUXCAAAA/llSAAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwADDQruqnlZVt1TVx6rqzqr6pWn8tKq6uaoWq+p3q+r4afyp0/7idHzrzHtdOo3fU1XnjVozAACslJFXtB9Jck53f3+SM5OcX1VnJ/nVJFd093cmeTjJxdP8i5M8PI1fMc1LVZ2e5KIkZyQ5P8nbqmrDwHUDAMCTNiy0e8mXpt3jplcnOSfJu6fxa5O8ZNq+cNrPdPwFVVXT+PXd/Uh3fzLJYpKzRq0bAABWwtB7tKtqQ1XtTfJgkt1J/meSz3f3o9OUfUk2TdubktyXJNPxLyR5xuz4Ic6Z/aydVbWnqvYcOHBgwF8DAADLNzS0u/ux7j4zyeYsXYX+roGfdVV3b+/u7QsLC6M+BgAAlmVVnjrS3Z9P8qEkP5jkhKraOB3anGT/tL0/yZYkmY5/S5LPzY4f4hwAAFiTRj51ZKGqTpi2vyHJC5PcnaXgfuk0bUeS903bu6b9TMc/2N09jV80PZXktCTbktwyat0AALASNn7tKU/YqUmunZ4Q8pQkN3T3+6vqriTXV9UvJ/lokqun+VcneWdVLSZ5KEtPGkl331lVNyS5K8mjSS7p7scGrhsAAJ60YaHd3bcnefYhxu/NIZ4a0t1/keRlh3mvy5NcvtJrBACAUfwyJAAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYIBhoV1VW6rqQ1V1V1XdWVU/PY2/vqr2V9Xe6XXBzDmXVtViVd1TVefNjJ8/jS1W1etGrRkAAFbKxoHv/WiSn+3u/1FVT09yW1Xtno5d0d1vmp1cVacnuSjJGUm+Lcl/raq/MR3+zSQvTLIvya1Vtau77xq4dgAAeFKGhXZ335/k/mn7z6rq7iSbjnDKhUmu7+5HknyyqhaTnDUdW+zue5Okqq6f5gptAADWrFW5R7uqtiZ5dpKbp6HXVNXtVXVNVZ04jW1Kct/MafumscONP/4zdlbVnqrac+DAgZX+EwAA4KgMD+2q+qYk70nyM939xSRXJnlWkjOzdMX7zSvxOd19VXdv7+7tCwsLK/GWAADwhI28RztVdVyWIvtd3f17SdLdD8wc/60k75929yfZMnP65mksRxgHAIA1aeRTRyrJ1Unu7u5fnxk/dWbajyS5Y9releSiqnpqVZ2WZFuSW5LcmmRbVZ1WVcdn6QuTu0atGwAAVsLIK9rPS/LjST5eVXunsZ9P8oqqOjNJJ/lUkp9Mku6+s6puyNKXHB9Nckl3P5YkVfWaJB9IsiHJNd1958B1AwDAkzbyqSMfTlKHOHTjEc65PMnlhxi/8UjnAQDAWuOXIQEAYAChDQAAAwhtAAAYQGgDAMAAQhsA
AAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGCAZYV2Vd20nDEAAGDJEX+CvaqeluQbk5xcVSfmL39S/ZuTbBq8NgAAWLeOGNpJfjLJzyT5tiS35S9D+4tJ/s24ZQEAwPp2xNDu7t9I8htV9U+6+62rtCYAAFj3vtYV7SRJd7+1qn4oydbZc7r7ukHrAgCAdW1ZoV1V70zyrCR7kzw2DXcSoQ0AAIewrNBOsj3J6d3dIxcDAADHiuU+R/uOJH995EIAAOBYstwr2icnuauqbknyyMHB7n7xkFUBAMA6t9zQfv3IRQAAwLFmuU8d+cPRCwEAgGPJcp868mdZespIkhyf5Lgk/7u7v3nUwgAAYD1b7hXtpx/crqpKcmGSs0ctCgAA1rvlPnXkK3rJf0xy3sovBwAAjg3LvXXkR2d2n5Kl52r/xZAVAQDAMWC5Tx35ezPbjyb5VJZuHwEAAA5hufdov3r0QgAA4FiyrHu0q2pzVb23qh6cXu+pqs2jFwcAAOvVcr8M+fYku5J82/T6T9MYAABwCMsN7YXufnt3Pzq93pFkYeC6AABgXVtuaH+uqn6sqjZMrx9L8rmRCwMAgPVsuaH9j5K8PMlnktyf5KVJ/uGgNQEAwLq33Mf7vSHJju5+OEmq6qQkb8pSgAMAAI+z3Cva33cwspOkux9K8uwxSwIAgPVvuaH9lKo68eDOdEV7uVfDAQDg685yQ/vNSf6kqt5YVW9M8t+T/KsjnVBVW6rqQ1V1V1XdWVU/PY2fVFW7q+oT078nTuNVVW+pqsWqur2qnjPzXjum+Z+oqh1P7E8FAIDVs6zQ7u7rkvxokgem14929zu/xmmPJvnZ7j49ydlJLqmq05O8LslN3b0tyU3TfpK8KMm26bUzyZXJV66eX5bkuUnOSnLZ7NV1AABYi5Z9+0d335XkrqOYf3+WnlCS7v6zqro7yaYkFyZ5/jTt2iR/kOS10/h13d1JPlJVJ1TVqdPc3dN94amq3UnOT/I7y10LAACstuXeOvKkVNXWLH158uYkp0wRniw9LvCUaXtTkvtmTts3jR1uHAAA1qzhoV1V35TkPUl+pru/OHtsunrdK/Q5O6tqT1XtOXDgwEq8JQAAPGFDQ7uqjstSZL+ru39vGn5guiUk078PTuP7k2yZOX3zNHa48b+iu6/q7u3dvX1hwa/DAwAwX8NCu6oqydVJ7u7uX585tCvJwSeH7EjyvpnxV01PHzk7yRemW0w+kOTcqjpx+hLkudMYAACsWSOfhf28JD+e5ONVtXca+/kkv5Lkhqq6OMmns/TT7klyY5ILkiwm+XKSVydLP44zPVLw1mneGw5+MRIAANaqYaHd3R9OUoc5/IJDzO8klxzmva5Jcs3KrQ4AAMZalaeOAADA1xuhDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGGBYaFfVNVX1YFXdMTP2+qraX1V7p9cFM8curarFqrqnqs6bGT9/GlusqteNWi8AAKykkVe035Hk/EOMX9HdZ06vG5Okqk5PclGSM6Zz3lZVG6pqQ5LfTPKiJKcnecU0FwAA1rSNo964u/+oqrYuc/qFSa7v7keSfLKqFpOcNR1b7O57k6Sqrp/m3rXS6wUAgJU0j3u0X1NVt0+3lpw4jW1Kct/MnH3T2OHGv0pV7ayqPVW158CBAyPWDQAAy7baoX1lkmclOTPJ/UnevFJv3N1Xdff27t6+sLCwUm8LAABPyLBbRw6lux84uF1Vv5Xk/dPu/iRbZqZunsZyhHEAAFizVvWKdlWdOrP7I0kOPpFkV5KL
quqpVXVakm1Jbklya5JtVXVaVR2fpS9M7lrNNQMAwBMx7Ip2Vf1OkucnObmq9iW5LMnzq+rMJJ3kU0l+Mkm6+86quiFLX3J8NMkl3f3Y9D6vSfKBJBuSXNPdd45aMwAArJSRTx15xSGGrz7C/MuTXH6I8RuT3LiCSwMAgOH8MiQAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGCAYaFdVddU1YNVdcfM2ElVtbuqPjH9e+I0XlX1lqparKrbq+o5M+fsmOZ/oqp2jFovAACspJFXtN+R5PzHjb0uyU3dvS3JTdN+krwoybbptTPJlclSmCe5LMlzk5yV5LKDcQ4AAGvZsNDu7j9K8tDjhi9Mcu20fW2Sl8yMX9dLPpLkhKo6Ncl5SXZ390Pd/XCS3fnqeAcAgDVnte/RPqW775+2P5PklGl7U5L7Zubtm8YON/5VqmpnVe2pqj0HDhxY2VUDAMBRmtuXIbu7k/QKvt9V3b29u7cvLCys1NsCAMATstqh/cB0S0imfx+cxvcn2TIzb/M0drhxAABY01Y7tHclOfjkkB1J3jcz/qrp6SNnJ/nCdIvJB5KcW1UnTl+CPHcaAwCANW3jqDeuqt9J8vwkJ1fVviw9PeRXktxQVRcn+XSSl0/Tb0xyQZLFJF9O8uok6e6HquqNSW6d5r2hux//BUsAAFhzhoV2d7/iMIdecIi5neSSw7zPNUmuWcGlAQDAcH4ZEgAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMMBcQruqPlVVH6+qvVW1Zxo7qap2V9Unpn9PnMarqt5SVYtVdXtVPWceawYAgKMxzyvaf7u7z+zu7dP+65Lc1N3bktw07SfJi5Jsm147k1y56isFAICjtJZuHbkwybXT9rVJXjIzfl0v+UiSE6rq1DmsDwAAlm1eod1Jfr+qbquqndPYKd19/7T9mSSnTNubktw3c+6+aeyvqKqdVbWnqvYcOHBg1LoBAGBZNs7pc3+4u/dX1bcm2V1Vfzp7sLu7qvpo3rC7r0pyVZJs3779qM4FAICVNpcr2t29f/r3wSTvTXJWkgcO3hIy/fvgNH1/ki0zp2+exgAAYM1a9dCuqr9WVU8/uJ3k3CR3JNmVZMc0bUeS903bu5K8anr6yNlJvjBziwkAAKxJ87h15JQk762qg5//2939X6rq1iQ3VNXFST6d5OXT/BuTXJBkMcmXk7x69ZcMAABHZ9VDu7vvTfL9hxj/XJIXHGK8k1yyCksDAIAVs5Ye7wcAAMcMoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADCG0AABhAaAMAwABCGwAABhDaAAAwgNAGAIABhDYAAAwgtAEAYAChDQAAAwhtAAAYQGgDAMAAQhsAAAYQ2gAAMIDQBgCAAYQ2AAAMILQBAGAAoQ0AAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADCC0AQBgAKENAAADrJvQrqrzq+qeqlqsqtfNez0AAHAk6yK0q2pDkt9M8qIkpyd5RVWdPt9VAQDA4a2L0E5yVpLF7r63u/9PkuuT
XDjnNQEAwGFtnPcClmlTkvtm9vclee7shKramWTntPulqrpnldYGR+vkJJ+d9yJYW+pNO+a9BFjr/N/JV7us5r2CJHnm4Q6sl9D+mrr7qiRXzXsd8LVU1Z7u3j7vdQCsJ/7vZD1aL7eO7E+yZWZ/8zQGAABr0noJ7VuTbKuq06rq+CQXJdk15zUBAMBhrYtbR7r70ap6TZIPJNmQ5JruvnPOy4Inyi1OAEfP/52sO9Xd814DAAAcc9bLrSMAALCuCG0AABhAaAMAwABCGwAABhDasEqq6seq6paq2ltV/7aqNsx7TQBrWVVtrao/rap3VdXdVfXuqvrGea8Llktowyqoqu9O8veTPK+7z0zyWJJXznVRAOvD30zytu7+7iRfTPKP57weWDahDavjBUl+IMmtVbV32v+Oua4IYH24r7v/eNr+90l+eJ6LgaOxLn6wBo4BleTa7r503gsBWGce/4MffgCEdcMVbVgdNyV5aVV9a5JU1UlV9cw5rwlgPfj2qvrBafsfJPnwPBcDR0Nowyro7ruS/EKS36+q25PsTnLqfFcFsC7ck+SSqro7yYlJrpzzemDZ/AQ7ALAmVdXWJO/v7u+Z91rgiXBFGwAABnBFGwAABnBFGwAABhDaAAAwgNAGAIABhDbA15Gqen1V/bN5rwPg64HQBgCAAYQ2wDGsql5VVbdX1ceq6p2PO/YTVXXrdOw9VfWN0/jLquqOafyPprEzquqWqto7vd+2efw9AOuJx/sBHKOq6owk703yQ9392ao6Kck/TfKl7n5TVT2juz83zf3lJA9091ur6uNJzu/u/VV1Qnd/vqremuQj3f2uqjo+yYbu/vN5/W0A64Er2gDHrnOS/Ifu/mySdPdDjzv+PVX136awfmWSM6bxP07yjqr6iSQbprE/SfLzVfXaJM8U2QBfm9AG+Pr1jiSv6e7vTfJLSZ6WJN39U0l+IcmWJLdNV75/O8mLk/x5khur6pz5LBlg/RDaAMeuDyZ5WVU9I0mmW0dmPT3J/VV1XJauaGea96zuvrm7fzHJgSRbquo7ktzb3W9J8r4k37cqfwHAOrZx3gsAYIzuvrOqLk/yh1X1WJKPJvnUzJR/keTmLMX0zVkK7yT5tenLjpXkpiQfS/LaJD9eVf83yWeS/MtV+SMA1jFfhgQAgAHcOgIAAAMIbQAAGEBoAwDAAEIbAAAGENoAADCA0AYAgAGENgAADPD/ASdJuH3n8RjJAAAAAElFTkSuQmCC\n",
      "text/plain": [
       "<Figure size 864x504 with 1 Axes>"
      ]
     },
     "metadata": {
      "needs_background": "light"
     },
     "output_type": "display_data"
    }
   ],
   "source": [
    "plt.figure(figsize=(12,7))\n",
    "\n",
    "sns.countplot(data=mushroom_data, x='class')"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e6733727",
   "metadata": {},
   "source": [
    "<a name='4'></a>\n",
    "\n",
    "## 4 - Data Preprocessing \n"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f6d484c5",
   "metadata": {},
   "source": [
    "Let's remove the missing values first. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "id": "277d9129",
   "metadata": {},
   "outputs": [],
   "source": [
    "mushroom_df = mushroom_data.dropna()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e9906083",
   "metadata": {},
   "source": [
    "For the purpose of performing clustering, we will remove the labels. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "id": "98d74144",
   "metadata": {},
   "outputs": [],
   "source": [
    "mushroom = mushroom_df.drop('class', axis=1)\n",
    "mushroom_labels = mushroom_df['class']"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "1786056c",
   "metadata": {},
   "source": [
    "Let's now convert all categorical features into the numerics."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "id": "b7ee6d06",
   "metadata": {},
   "outputs": [],
   "source": [
    "from sklearn.preprocessing import OrdinalEncoder\n",
    "\n",
    "encoder = OrdinalEncoder()\n",
    "\n",
    "mushroom_prepared = encoder.fit_transform(mushroom)"
   ]
  },
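  {
   "cell_type": "markdown",
   "id": "c4e17a90",
   "metadata": {},
   "source": [
    "To make the encoding step concrete, here is a minimal sketch on a tiny made-up table. The column names, values, and the `toy_` variable names are hypothetical and not part of the mushroom data. `OrdinalEncoder` sorts the categories of each column and replaces each one with its index, so the resulting order is arbitrary rather than meaningful.\n",
    "\n",
    "```python\n",
    "import pandas as pd\n",
    "from sklearn.preprocessing import OrdinalEncoder\n",
    "\n",
    "# Hypothetical toy data, for illustration only\n",
    "toy = pd.DataFrame({\n",
    "    'color': ['red', 'brown', 'red'],\n",
    "    'odor': ['none', 'foul', 'none'],\n",
    "})\n",
    "\n",
    "toy_encoder = OrdinalEncoder()\n",
    "toy_encoded = toy_encoder.fit_transform(toy)\n",
    "\n",
    "# Categories are sorted per column, then replaced by their index\n",
    "print(toy_encoder.categories_)\n",
    "print(toy_encoded)\n",
    "```\n",
    "\n",
    "Because KMeans relies on distances, that arbitrary order can influence the clusters. `OneHotEncoder` from the same module is a common alternative worth trying."
   ]
  },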
  {
   "cell_type": "code",
   "execution_count": 23,
   "id": "805385b4",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[5., 2., 4., ..., 1., 3., 5.],\n",
       "       [5., 2., 7., ..., 2., 2., 1.],\n",
       "       [0., 2., 6., ..., 2., 2., 3.],\n",
       "       ...,\n",
       "       [5., 3., 3., ..., 5., 5., 4.],\n",
       "       [5., 3., 1., ..., 5., 1., 0.],\n",
       "       [2., 3., 1., ..., 5., 1., 0.]])"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mushroom_prepared"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b85b688e",
   "metadata": {},
   "source": [
    "As you can see above, `mushroom_prepared` is a NumPy array. We can convert it back to the Pandas Dataframe although KMeans algorithm can accept both as input. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "id": "8ea82660",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>cap-shape</th>\n",
       "      <th>cap-surface</th>\n",
       "      <th>cap-color</th>\n",
       "      <th>bruises%3F</th>\n",
       "      <th>odor</th>\n",
       "      <th>gill-attachment</th>\n",
       "      <th>gill-spacing</th>\n",
       "      <th>gill-size</th>\n",
       "      <th>gill-color</th>\n",
       "      <th>stalk-shape</th>\n",
       "      <th>...</th>\n",
       "      <th>stalk-surface-below-ring</th>\n",
       "      <th>stalk-color-above-ring</th>\n",
       "      <th>stalk-color-below-ring</th>\n",
       "      <th>veil-type</th>\n",
       "      <th>veil-color</th>\n",
       "      <th>ring-number</th>\n",
       "      <th>ring-type</th>\n",
       "      <th>spore-print-color</th>\n",
       "      <th>population</th>\n",
       "      <th>habitat</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>5.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>4.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>5.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>7.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>3.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>5.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>6.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>5.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>4</th>\n",
       "      <td>5.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>3.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>...</td>\n",
       "      <td>2.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>5.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>2.0</td>\n",
       "      <td>0.0</td>\n",
       "      <td>1.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>5 rows × 22 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "   cap-shape  cap-surface  cap-color  bruises%3F  odor  gill-attachment  \\\n",
       "0        5.0          2.0        4.0         1.0   6.0              1.0   \n",
       "1        5.0          2.0        7.0         1.0   0.0              1.0   \n",
       "2        0.0          2.0        6.0         1.0   3.0              1.0   \n",
       "3        5.0          3.0        6.0         1.0   6.0              1.0   \n",
       "4        5.0          2.0        3.0         0.0   5.0              1.0   \n",
       "\n",
       "   gill-spacing  gill-size  gill-color  stalk-shape  ...  \\\n",
       "0           0.0        1.0         2.0          0.0  ...   \n",
       "1           0.0        0.0         2.0          0.0  ...   \n",
       "2           0.0        0.0         3.0          0.0  ...   \n",
       "3           0.0        1.0         3.0          0.0  ...   \n",
       "4           1.0        0.0         2.0          1.0  ...   \n",
       "\n",
       "   stalk-surface-below-ring  stalk-color-above-ring  stalk-color-below-ring  \\\n",
       "0                       2.0                     5.0                     5.0   \n",
       "1                       2.0                     5.0                     5.0   \n",
       "2                       2.0                     5.0                     5.0   \n",
       "3                       2.0                     5.0                     5.0   \n",
       "4                       2.0                     5.0                     5.0   \n",
       "\n",
       "   veil-type  veil-color  ring-number  ring-type  spore-print-color  \\\n",
       "0        0.0         0.0          1.0        3.0                1.0   \n",
       "1        0.0         0.0          1.0        3.0                2.0   \n",
       "2        0.0         0.0          1.0        3.0                2.0   \n",
       "3        0.0         0.0          1.0        3.0                1.0   \n",
       "4        0.0         0.0          1.0        0.0                2.0   \n",
       "\n",
       "   population  habitat  \n",
       "0         3.0      5.0  \n",
       "1         2.0      1.0  \n",
       "2         2.0      3.0  \n",
       "3         3.0      5.0  \n",
       "4         0.0      1.0  \n",
       "\n",
       "[5 rows x 22 columns]"
      ]
     },
     "execution_count": 24,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mushroom_prep_df = pd.DataFrame(mushroom_prepared, columns=mushroom.columns)\n",
    "mushroom_prep_df.head()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "bda280c6",
   "metadata": {},
   "source": [
    "No alphabets anymore. They were perfectly encoded or converted to numerics representation. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "d134d334",
   "metadata": {},
   "source": [
    "We are now ready to find the labels with KMeans Clustering. Again, this is for the assumption that we do not have labels, or to make it simple, we have a data about the characteristics of different plants, but we do not know if they are edible or not. We want to use unsupervised learning to figure that out. "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dd496de6",
   "metadata": {},
   "source": [
    "<a name='5'></a>\n",
    "\n",
    "## 5 - Training K-Means Clustering to Find Clusters"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "e28b943f",
   "metadata": {},
   "source": [
    "We are going to create a KMeans model from `sklearn.cluster`. We will remember to provide the number of the clusters, which is 2 in our case. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "id": "d46bdcef",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "KMeans(n_clusters=2)"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.cluster import KMeans\n",
    "\n",
    "k_clust = KMeans(n_clusters=2, random_state=42)\n",
    "\n",
    "k_clust.fit(mushroom_prep_df)"
   ]
  },
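  {
   "cell_type": "markdown",
   "id": "9b2d6f31",
   "metadata": {},
   "source": [
    "As a small illustration of what fitting does, the sketch below runs KMeans on six made-up 2-D points that form two obvious groups. The data and the `toy_` names are hypothetical. `n_init=10` is passed explicitly only as a precaution, on the assumption that its default may differ between scikit-learn versions.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "# Hypothetical toy data: two well-separated groups of 2-D points\n",
    "toy_points = np.array([\n",
    "    [0.0, 0.0], [0.0, 1.0], [1.0, 0.0],\n",
    "    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0],\n",
    "])\n",
    "\n",
    "toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=42)\n",
    "toy_kmeans.fit(toy_points)\n",
    "\n",
    "# Points in the same group share a cluster ID; which group gets ID 0 is arbitrary\n",
    "print(toy_kmeans.labels_)\n",
    "\n",
    "# predict() assigns new points to the nearest learned center\n",
    "toy_new = np.array([[0.5, 0.5], [10.5, 10.5]])\n",
    "print(toy_kmeans.predict(toy_new))\n",
    "```\n",
    "\n",
    "The first three points should land in one cluster and the last three in the other, and each new point should join the group it sits closest to."
   ]
  },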
  {
   "cell_type": "markdown",
   "id": "d8d0f2a7",
   "metadata": {},
   "source": [
    "We can access the cluster centers by `model.cluster_centers_`. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "id": "23a25b60",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[3.42103549e+00, 1.56756757e+00, 4.99153370e+00, 2.70270270e-01,\n",
       "        2.78508629e+00, 1.00000000e+00, 2.78736568e-01, 1.09736242e-01,\n",
       "        2.04265711e+00, 3.07391729e-01, 8.42722240e-01, 1.27841094e+00,\n",
       "        1.31813741e+00, 3.87463367e+00, 3.87463367e+00, 0.00000000e+00,\n",
       "        9.97465999e-18, 1.00651254e+00, 1.42526864e+00, 8.16020840e-01,\n",
       "        3.34255943e+00, 1.72549658e+00],\n",
       "       [3.41935484e+00, 1.69840653e+00, 3.41507967e+00, 9.14885348e-01,\n",
       "        4.49553051e+00, 9.93004275e-01, 6.52934318e-02, 1.42635056e-01,\n",
       "        5.32024874e+00, 7.52429071e-01, 3.04702682e-01, 1.92382433e+00,\n",
       "        1.97901283e+00, 4.03925379e+00, 4.00194326e+00, 0.00000000e+00,\n",
       "        3.10921104e-03, 1.02487369e+00, 2.89739604e+00, 1.69218811e+00,\n",
       "        4.15507190e+00, 6.51768364e-01]])"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "k_clust.cluster_centers_"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a81504c9",
   "metadata": {},
   "source": [
    "Also, we can get the labels that the KMeans provided for each data point. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "id": "1c7dd9fa",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0, 0, 0, ..., 1, 1, 1], dtype=int32)"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "k_labels = k_clust.labels_\n",
    "k_labels"
   ]
  },
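  {
   "cell_type": "markdown",
   "id": "e7a05c48",
   "metadata": {},
   "source": [
    "The cluster centers and the labels are directly related: each point is labeled with the index of its nearest center. The sketch below checks that on a few made-up points by computing the Euclidean distances by hand. The data and the `toy_` names are hypothetical.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.cluster import KMeans\n",
    "\n",
    "# Hypothetical toy data, for illustration only\n",
    "toy_points = np.array([[1.0, 2.0], [1.5, 1.8], [8.0, 8.0], [9.0, 9.5]])\n",
    "toy_kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(toy_points)\n",
    "\n",
    "# Distance from every point to every center, shape (n_points, n_clusters)\n",
    "diffs = toy_points[:, None, :] - toy_kmeans.cluster_centers_[None, :, :]\n",
    "distances = np.linalg.norm(diffs, axis=2)\n",
    "\n",
    "# Each point's label is the index of its nearest center\n",
    "nearest = distances.argmin(axis=1)\n",
    "print(np.array_equal(nearest, toy_kmeans.labels_))\n",
    "```\n",
    "\n",
    "The same relationship is what makes the centers useful as a compact summary of each cluster."
   ]
  },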
  {
   "cell_type": "markdown",
   "id": "b78a88c4",
   "metadata": {},
   "source": [
    "<a name='6'></a>\n",
    "\n",
    "### 6 -Evaluating K-Means Clustering"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8c0cc81c",
   "metadata": {},
   "source": [
    "In real world, evaluating the performance of KMeans is not an easy thing, because there are not true labels to compare with the clustered labels. In our case since we have them, we can find things like accuracy score, or even find the confusion matrix to display the actual and predicted classes. Not to mention classification report to find things like Recall, Precision, or F1 Score. \n",
    "\n",
    "But again since we are merely comparing the labels(true and clustered), we do not need that extra metrics. \n",
    "\n",
    "Before finding the accuracy score, I will first convert the true labels into the numbers or encode them. For simplicity, I will use a map function. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "id": "9c48b744",
   "metadata": {},
   "outputs": [],
   "source": [
    "map_dict = {\n",
    "    \n",
    "    'p':0,\n",
    "    'e':1\n",
    "}\n",
    "\n",
    "mushroom_labels_prep = mushroom_labels.map(map_dict)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "id": "c9acf918",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0       0\n",
       "1       1\n",
       "2       1\n",
       "3       0\n",
       "4       1\n",
       "       ..\n",
       "7986    1\n",
       "8001    1\n",
       "8038    1\n",
       "8095    0\n",
       "8114    0\n",
       "Name: class, Length: 5644, dtype: category\n",
       "Categories (2, int64): [1, 0]"
      ]
     },
     "execution_count": 32,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "mushroom_labels_prep"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "id": "312bdbb1",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.673458540042523"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "from sklearn.metrics import accuracy_score\n",
    "\n",
    "accuracy_score(mushroom_labels_prep, k_labels)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "5f708da8",
   "metadata": {},
   "source": [
    "This is not excellent, but it's so impressive. Why? Well, KMeans never saw the labels, it was only feed the data of different characteristics of poisonous and edible mushrooms and its job was to try to find patterns in the data so as to learn if a given mushroom specy is a poisonous or edible. \n",
    "\n",
    "\n",
    "KMeans algorithm is very useful in areas where you have a bunch of unlabeled data. Take an example in customer segmentation. You may want to provide different promotions to some groups of your customers but you have no clue of who would benefit from that particular promotion. So, you can try to find the group of customers using this algorithm. It will try to group similar customers according to their interests, and will likely appreciate the promotion.\n",
    "\n",
    "The same concept can be applied to grouping the equipments that has similar defects in an industry. That was just mentioning few, there are more applications of KMeans clustering. "
   ]
  },
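  {
   "cell_type": "markdown",
   "id": "2f8b1d6e",
   "metadata": {},
   "source": [
    "One caveat about the accuracy computed above: cluster IDs are arbitrary, so cluster 0 does not have to correspond to class 0. A minimal sketch of one way to handle this for two clusters is to score both possible ID assignments and keep the better one. The helper `aligned_accuracy` and the toy labels are hypothetical names introduced for illustration. `adjusted_rand_score` from `sklearn.metrics` is shown as an alternative that ignores the IDs altogether.\n",
    "\n",
    "```python\n",
    "import numpy as np\n",
    "from sklearn.metrics import accuracy_score, adjusted_rand_score\n",
    "\n",
    "def aligned_accuracy(true_labels, cluster_labels):\n",
    "    # Two-cluster case only: try both ID assignments, keep the better one\n",
    "    true_labels = np.asarray(true_labels)\n",
    "    cluster_labels = np.asarray(cluster_labels)\n",
    "    direct = accuracy_score(true_labels, cluster_labels)\n",
    "    flipped = accuracy_score(true_labels, 1 - cluster_labels)\n",
    "    return max(direct, flipped)\n",
    "\n",
    "# Hypothetical labels: the grouping is perfect, but the IDs are swapped\n",
    "toy_true = [0, 0, 1, 1, 1]\n",
    "toy_clusters = [1, 1, 0, 0, 0]\n",
    "\n",
    "print(accuracy_score(toy_true, toy_clusters))       # 0.0\n",
    "print(aligned_accuracy(toy_true, toy_clusters))     # 1.0\n",
    "print(adjusted_rand_score(toy_true, toy_clusters))  # 1.0\n",
    "```\n",
    "\n",
    "On the mushroom data this would look like `aligned_accuracy(mushroom_labels_prep, k_labels)`, assuming both hold only the values 0 and 1."
   ]
  },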
  {
   "cell_type": "markdown",
   "id": "c1b8979e",
   "metadata": {},
   "source": [
    "<a name='7'></a>\n",
    "\n",
    "### 7 - Final Notes\n",
    "\n",
    "In this notebook, we learned the idea behind unsupervised learning and KMeans clustering. We also practiced that on mushroom dataset where we were interested in grouping the species that can be poisonous or edible. \n",
    "\n",
    "If you like mushrooms and you know some of their characteristics, no doubt that you enjoyed this notebook. Maybe pick one edible sample and make it your next meal :)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "3d2be035",
   "metadata": {},
   "source": [
    "## [BACK TO TOP](#0)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3.7.10 64-bit ('tensor': conda)",
   "language": "python",
   "name": "python3710jvsc74a57bd034ac5db714c5906ee087fcf6e2d00ee4febf096586592b6ba3662ed3b7e7a5f6"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
